--------------------------------------------------------------------------------
MEMORANDUM

TO:	Jesse Pritts, U.S. EPA	

FROM:	Cortney Itle, ERG
	
DATE:	January 20, 2016

SUBJECT:	Proposed Approach for Data Analysis and Quality Assurance Using Drillinginfo's (DI) Desktop(R) Well File Database (DCN CWT00368)
	
	In support of the Centralized Waste Treatment (CWT) Study, ERG plans to characterize the oil and gas extraction industry with counts of existing wells by basin to provide a broad overview of the current size and scope of the industry. For this analysis, ERG developed a database of active oil and gas wells using Drillinginfo's (DI) Desktop(R) Well File Database, a nationwide database of all oil and gas wells. This memorandum describes ERG's analysis and is organized into the following sections:

      Section 1.0 provides background information about the DI Desktop(R) database and the purpose of this analysis.
      Section 2.0 discusses how ERG created the DI Desktop(R) Development database, which is used to create a subset of the DI Desktop(R) database containing only active oil and gas wells.
      Section 3.0 summarizes the results of the DI Desktop(R) database analysis and how they compare to the literature.
      Section 4.0 describes ERG's quality assurance procedures.
      Section 5.0 lists the references used for this analysis.
 
1.0 Background Information
	Drillinginfo (DI) is an oil and gas research firm located in Austin, Texas. DI Desktop(R) is a comprehensive database generated by Drillinginfo that contains a record (i.e., row) for each oil and gas well drilled in the United States. Drillinginfo uses oil and gas databases maintained by individual state oil and gas agencies to create the DI Desktop(R) database.

	For this data analysis, ERG used a version of the DI Desktop(R) database downloaded on March 30, 2015. For the most part, this version of DI Desktop(R) reflects wells drilled as of 2014, but currency varies by state. State-level information, such as the oil and gas agency names, last production date, production start date, and update frequency (e.g., monthly, quarterly), is provided in Drillinginfo's Data Coverage table (Drillinginfo, 2015).

	Table 1 lists the data fields in DI Desktop(R). Basic well data contained in DI Desktop(R) for each well include the well API number, location, operator, and well trajectory. DI Desktop(R) also includes annual oil, gas, and produced water production per well. Note that DI Desktop(R) includes records for wells that are no longer active (i.e., shut in), underground injection wells that do not produce oil and/or gas, conventional oil and gas (COG) wells, coalbed methane (CBM) wells, and unconventional oil and gas (UOG) wells; it does not, however, contain a field that distinguishes between COG and UOG wells. The fields ERG used in this analysis are marked with an asterisk.
Table 1. Field Names and Descriptions in DI Desktop(R)

Field Name          Description
ENTITY_ID*          DI assigned ID unique to a given property. A well is referred to as a "property" in DI Desktop(R).
API_NO              API assigned number of a well on the property.
PROPERTY_TYPE       Property type (e.g., lease, unit, well, completion, other, unknown).
PRODUCTION_TYPE*    Production type (e.g., oil, gas, injection).
PROD_TYPE_CLASS*    Classification of production type into D&A (drilled and abandoned), gas, injection, O&G (oil and gas), oil, and other.
PROD_FLAG           Production flag to indicate whether the well should be producing liquids. This is "Yes" for "Gas," "Oil," and "O&G" production type classification.
LIQUID_PROD_TYPE    Liquid production type (i.e., unknown, condensate, or oil) based on the production type classification and well test data.
WELL_NAME           Operator assigned well/lease name of the property.
FIELD*              Field name the property is reporting from (production field).
CURR_OPER_NAME      Current operator name.
SPUD_DATE           Date drilling commenced on property.
COMMON_OPER_NAME    Corporate entity that is determined by DI to own the current operator.
LATITUDE_NAD27      Surface latitude the property is located in; for multi-well properties DI Desktop(R) picked a well to designate the location of the property, in NAD27 format.
LONGITUDE_NAD27     Surface longitude the property is located in; for multi-well properties DI Desktop(R) picked a well to designate the location of the property, in NAD27 format.
LATITUDE_NAD83      Surface latitude the property is located in; for multi-well properties DI Desktop(R) picked a well to designate the location of the property, in NAD83 format.
LONGITUDE_NAD83     Surface longitude the property is located in; for multi-well properties DI Desktop(R) picked a well to designate the location of the property, in NAD83 format.
COUNTY*             County the property is located in.
FIPS_CODE           Federal Information Processing Standard (FIPS) county code based on county or GIS analysis using latitude and longitude.
DISTRICT*           District within a given state the property is assigned.
STATE*              State the property is located in.
EPA_REGION          EPA region the property is located in.
OFFSHORE            Offshore waters indicator.
RESERVOIR*          Reservoir, formation, zone, or pool that the property is reported as producing from.
BASIN*              The basin the property is located in.
FORMATION*          Formation that the property is reported as producing from.
STATUS*             Current status of the well (e.g., active, inactive, shut in).
TOTAL_DEPTH         Total depth the well was drilled to.
PLUG_DATE           Date the well was plugged. Note: for instances where duplicate API numbers were combined, the maximum value was selected.
COMPLETION_DATE     Most recent completion date of the well.
COMPLETION_YEAR     Year of the completion date.
Well Trajectory     This is the configuration of the wellbore. Options include: H - Horizontal; D - Directional; V - Vertical; U - Unknown.
FIRST_PROD_DATE     First date of reported production for the property. Note: for instances where duplicate API numbers were combined, the minimum value was selected.
LAST_PROD_DATE      Last date production was reported for the property.
LATITUDE_BOTM       Bottom hole latitude of the property.
LONGITUDE_BOTM      Bottom hole longitude of the property.
SumOfLIQ[xx]        Annual oil production in barrels. A separate column is provided for each year from 2000 (i.e., "SUMOFLIQ00") through 2014 (i.e., "SUMOFLIQ14").
SumOfGAS[xx]        Annual gas production in thousand cubic feet. A separate column is provided for each year from 2000 (i.e., "SUMOFGAS00") through 2014 (i.e., "SUMOFGAS14").
SumOfWTR[xx]        Annual produced water production in barrels. A separate column is provided for each year from 2000 (i.e., "SUMOFWTR00") through 2014 (i.e., "SUMOFWTR14").
PROD[xx]_FLAG       Yes/No flag indicating if oil and/or gas production was greater than zero for a given year. A separate column is provided for each year from 2000 (i.e., "PROD00_FLAG") through 2014 (i.e., "PROD14_FLAG").
ACTIVE_FLAG         Yes/No flag indicating whether or not a well is active based on production.
ACTIVE_PROD_FLAG    Yes/No flag indicating whether the entity (i.e., property) is active (using the ACTIVE_FLAG field) and had production in 2014 (using the PROD14_FLAG field).

* Indicates data field used in ERG's analysis.
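
The flag definitions at the end of Table 1 can be illustrated with a short Python sketch. It is not Drillinginfo's implementation: the function names and the sample records are hypothetical, and the logic simply restates the table descriptions (a PROD[xx]_FLAG of "Yes" when oil and/or gas production exceeds zero, and an ACTIVE_PROD_FLAG of "Yes" when the property is active and produced in 2014).

```python
def prod_flag(record, year):
    """Return "Yes" if oil and/or gas production was above zero in the given year.

    Mirrors the PROD[xx]_FLAG description in Table 1; produced water
    (SUMOFWTR[xx]) is deliberately not considered.
    """
    yy = f"{year % 100:02d}"                    # 2014 -> "14", 2005 -> "05"
    liquid = record.get(f"SUMOFLIQ{yy}") or 0   # annual oil, barrels
    gas = record.get(f"SUMOFGAS{yy}") or 0      # annual gas, thousand cubic feet
    return "Yes" if liquid > 0 or gas > 0 else "No"


def active_prod_flag(record):
    """Return "Yes" if the property is active and had production in 2014."""
    is_active = record.get("ACTIVE_FLAG") == "Yes"
    return "Yes" if is_active and prod_flag(record, 2014) == "Yes" else "No"


# Hypothetical records for illustration only.
producing_well = {"ACTIVE_FLAG": "Yes", "SUMOFLIQ14": 0, "SUMOFGAS14": 1250, "SUMOFWTR14": 300}
shut_in_well = {"ACTIVE_FLAG": "No", "SUMOFLIQ14": 0, "SUMOFGAS14": 0}

print(prod_flag(producing_well, 2014), active_prod_flag(producing_well))
print(prod_flag(shut_in_well, 2014), active_prod_flag(shut_in_well))
```

Missing production values are treated as zero in this sketch, which is an assumption; the memo does not say how DI Desktop(R) handles blanks.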

2.0 Methodology
The following steps describe how ERG used the raw DI Desktop(R) database to produce a list and count of active oil and gas extraction wells by basin.
 1. ERG created a query to select active oil and gas extraction wells and write them to a new table called "Active Oil and Gas Wells". This query checks each well record in the raw DI Desktop(R) database (i.e., the table "DI Desktop Wells") (Drillinginfo, 2015) against the criteria listed below to identify active oil and gas wells.
    a. Query 100 includes only wells that are designated as "oil", "gas", or "O&G" in the PROD_TYPE_CLASS field. This removes wells that are classified as injection wells, carbon storage wells, water wells, etc.
    b. For all states, ERG considered active wells to be those that produced oil, gas, and/or produced water during the year in which Drillinginfo last updated the data. Because each state updates its data at different times, ERG used a different date for each state to complete this step. The most recent production date for each state is provided in Drillinginfo's Data Coverage table (Drillinginfo, 2015).
 2. ERG reviewed and corrected the data in the "Active Oil and Gas Wells" table. DI Desktop(R) identifies the basin, reservoir, and formation in which each well is completed. These fields include inconsistent naming conventions and spelling errors. ERG resolved spelling errors by manually correcting each misspelled basin name individually. ERG resolved unknown basin names to the extent possible as follows:
    a. The list of active oil and gas wells contained about 1,100 wells with "N/A", "0", "N", or a blank as the basin type (referred to below as "NA" basins). ERG created a query to extract the ENTITY_ID, RESERVOIR, BASIN, FORMATION, STATE, FIELD, DISTRICT, and COUNTY fields for each of these wells and exported the data to Excel.
    b. ERG updated as many unknown and "NA" basins in the Excel file as possible, based on Google searches and other references as needed. The Excel table with the updated unknown and "NA" basins was uploaded into the database as the "NA Cleanup" table, containing the ENTITY_ID, RESERVOIR, BASIN, FORMATION, STATE, FIELD, DISTRICT, COUNTY, and UPDATED BASIN for each well.
    c. ERG ran an update query to populate the UPDATED BASIN field in the "Active Oil and Gas Wells" table from the "NA Cleanup" table. The query matched records on the combination of ENTITY_ID, RESERVOIR, BASIN, FORMATION, STATE, FIELD, DISTRICT, and COUNTY in both tables. After the data corrections, the number of wells with an unknown or "NA" basin type fell to 850 (XX percent of wells).
 3. ERG created a query of the updated "Active Oil and Gas Wells" table to count the total number of wells covered under each basin type.
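
The three steps above can be sketched end to end with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not ERG's Microsoft Access queries: the table names (di_desktop_wells, state_coverage, na_cleanup, active_wells), the LAST_UPDATE_YEAR column, and all sample rows are hypothetical; LAST_PROD_DATE stands in for whatever field the activity test actually used; and the cleanup join matches on ENTITY_ID alone rather than on the full field combination described in the memo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE di_desktop_wells (      -- miniature of the raw well table
    ENTITY_ID INTEGER PRIMARY KEY,
    PROD_TYPE_CLASS TEXT,
    STATE TEXT,
    BASIN TEXT,
    LAST_PROD_DATE TEXT              -- ISO date string (assumed format)
);
CREATE TABLE state_coverage (        -- hypothetical stand-in for the Data Coverage table
    STATE TEXT PRIMARY KEY,
    LAST_UPDATE_YEAR TEXT
);
CREATE TABLE na_cleanup (            -- simplified NA Cleanup table
    ENTITY_ID INTEGER PRIMARY KEY,
    UPDATED_BASIN TEXT
);
""")
conn.executemany("INSERT INTO di_desktop_wells VALUES (?, ?, ?, ?, ?)", [
    (1, "OIL", "TX", "Permian", "2014-12-01"),
    (2, "GAS", "PA", "Appalachian", "2014-10-01"),
    (3, "INJECTION", "TX", "Permian", "2014-12-01"),  # dropped: not oil, gas, or O&G
    (4, "O&G", "NM", "N/A", "2014-09-01"),            # basin supplied by the cleanup table
    (5, "OIL", "KS", "", "2013-06-01"),               # kept: KS coverage year is 2013
    (6, "GAS", "TX", "Ft. Worth", "2011-03-01"),      # dropped: no recent production
    (7, "oil", "TX", "Permian", "2014-08-01"),        # class matched case-insensitively
])
conn.executemany("INSERT INTO state_coverage VALUES (?, ?)",
                 [("TX", "2014"), ("PA", "2014"), ("NM", "2014"), ("KS", "2013")])
conn.execute("INSERT INTO na_cleanup VALUES (4, 'San Juan')")

# Step 1: keep oil, gas, and O&G wells that produced in their state's latest data year.
conn.execute("""
CREATE TABLE active_wells AS
SELECT w.ENTITY_ID, w.STATE, w.BASIN, NULL AS UPDATED_BASIN
FROM di_desktop_wells AS w
JOIN state_coverage AS c ON c.STATE = w.STATE
WHERE UPPER(w.PROD_TYPE_CLASS) IN ('OIL', 'GAS', 'O&G')
  AND substr(w.LAST_PROD_DATE, 1, 4) = c.LAST_UPDATE_YEAR
""")

# Step 2: copy researched basin names into the active-well table.
conn.execute("""
UPDATE active_wells
SET UPDATED_BASIN = (SELECT n.UPDATED_BASIN FROM na_cleanup AS n
                     WHERE n.ENTITY_ID = active_wells.ENTITY_ID)
WHERE ENTITY_ID IN (SELECT ENTITY_ID FROM na_cleanup)
""")

# Step 3: count wells per basin, grouping unresolved names under 'NA'.
rows = conn.execute("""
SELECT CASE WHEN COALESCE(UPDATED_BASIN, BASIN, '') IN ('N/A', '0', 'N', '')
            THEN 'NA' ELSE COALESCE(UPDATED_BASIN, BASIN) END AS basin_name,
       COUNT(*) AS n_wells
FROM active_wells
GROUP BY basin_name
ORDER BY n_wells DESC, basin_name
""").fetchall()
basin_counts = dict(rows)
print(basin_counts)
```

Keeping the corrected name in a separate UPDATED_BASIN column, as the memo describes, leaves the original BASIN value available for the QC review.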
3.0 Results
	This section summarizes the results of the DI Desktop(R) database analysis. The total number of wells that were active in 2014 may have been higher than the number ERG estimated. This is primarily because the BASIN field for many active oil and gas wells was reported as "N/A", "0", "N", or blank and ERG could not include them in the analysis (850 wells).

	Table 2 lists the total number of wells covered under each basin type, as well as the states included in each basin type. A total of 1,119,098 wells were included in this analysis. The Permian basin has the largest number of wells, with a total of 295,308 wells in New Mexico and Texas.

Table 2. Total Number of Wells Covered by Each Basin

Basin Name                          Number of Wells  States Included
Permian                                     295,308  NM, TX
Appalachian                                 187,981  AL, KY, MD, NY, OH, PA, TN, VA, WV
Anadarko                                     93,855  CO, KS, OK, TX
Texas and Louisiana Gulf Coast               84,951  LA, TX
Ft. Worth                                    79,210  TX
East Texas                                   49,268  TX
Arkla                                        32,184  AR, LA, MS, TX
Cherokee                                     23,641  KS
San Juan                                     23,432  CO, NM
Denver Julesburg                             22,908  CO, NE, WY
Central Kansas Uplift                        20,891  KS
San Joaquin                                  17,728  CA
Arkoma                                       17,119  AR, NM, OK
Williston                                    16,972  MT, ND, SD
Chautauqua Platform                          15,053  KS, OK
Powder River                                 13,654  MT, SD, WY
Michigan                                     11,966  MI, OH
Uinta                                        11,478  UT
Piceance                                     11,173  CO
Green River                                  10,924  CO, UT, WY
Forest City                                   9,811  KS, MO, NE
South Oklahoma Folded Belt                    8,258  OK, TX
Sedgwick                                      7,404  KS
Nemaha Anticline                              6,017  KS
Ouachita Folded Belt                          5,522  OK, TX
Black Warrior                                 4,801  AL, MS
Mississippi and Alabama Gulf Coast            3,800  AL, FL, LA, MS
Palo Duro                                     3,626  OK, TX
Los Angeles                                   3,602  CA
Sweet Grass Arch                              3,582  MT
Raton                                         3,569  CO, NM
Wind River                                    2,722  WY
Big Horn                                      2,698  MT, WY
Las Animas Arch                               1,767  CO, KS
Central Western Overthrust                    1,764  WY
Arctic Slope                                  1,502  AK
Illinois                                      1,448  AR, KY
Chadron Arch                                  1,282  KS, NE
Paradox                                       1,036  CO, UT
Central Montana Uplift                          918  MT
NA                                              850  CA, FL, KS, LA, MI, NV, OR, TX, UT
Ventura                                         831  CA
Cincinnati Arch                                 774  KY, OH
Santa Maria                                     630  CA
Salina                                          419  CA, KS, NE
Cook Inlet                                      305  AK
Sacramento                                      275  CA
Great Basin                                      58  NV
Black Mesa                                       25  AZ
North Park                                       24  CO
GOM - Shelf                                      23  TX
Wasatch Uplift                                   23  UT
Arctic Ocean, ST.                                18  AK
Northern Coast PRVC                              13  CA
Eel River                                         4  CA
Half Moon                                         1
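
As a consistency check on Table 2, the basin counts can be summed and compared with the total reported in the text, and basin shares can be derived from the same figures. The Python sketch below transcribes the well counts from Table 2; the variable and function names are illustrative only, and the percentages it prints are derived here rather than quoted from the memo.

```python
# Well counts transcribed from Table 2. "NA" holds wells with no usable basin name.
WELLS_BY_BASIN = {
    "Permian": 295_308, "Appalachian": 187_981, "Anadarko": 93_855,
    "Texas and Louisiana Gulf Coast": 84_951, "Ft. Worth": 79_210,
    "East Texas": 49_268, "Arkla": 32_184, "Cherokee": 23_641,
    "San Juan": 23_432, "Denver Julesburg": 22_908,
    "Central Kansas Uplift": 20_891, "San Joaquin": 17_728, "Arkoma": 17_119,
    "Williston": 16_972, "Chautauqua Platform": 15_053, "Powder River": 13_654,
    "Michigan": 11_966, "Uinta": 11_478, "Piceance": 11_173,
    "Green River": 10_924, "Forest City": 9_811,
    "South Oklahoma Folded Belt": 8_258, "Sedgwick": 7_404,
    "Nemaha Anticline": 6_017, "Ouachita Folded Belt": 5_522,
    "Black Warrior": 4_801, "Mississippi and Alabama Gulf Coast": 3_800,
    "Palo Duro": 3_626, "Los Angeles": 3_602, "Sweet Grass Arch": 3_582,
    "Raton": 3_569, "Wind River": 2_722, "Big Horn": 2_698,
    "Las Animas Arch": 1_767, "Central Western Overthrust": 1_764,
    "Arctic Slope": 1_502, "Illinois": 1_448, "Chadron Arch": 1_282,
    "Paradox": 1_036, "Central Montana Uplift": 918, "NA": 850,
    "Ventura": 831, "Cincinnati Arch": 774, "Santa Maria": 630,
    "Salina": 419, "Cook Inlet": 305, "Sacramento": 275, "Great Basin": 58,
    "Black Mesa": 25, "North Park": 24, "GOM - Shelf": 23,
    "Wasatch Uplift": 23, "Arctic Ocean, ST.": 18,
    "Northern Coast PRVC": 13, "Eel River": 4, "Half Moon": 1,
}
REPORTED_TOTAL = 1_119_098  # total stated in the Results text

total = sum(WELLS_BY_BASIN.values())


def share_percent(basin):
    """Percentage of all wells in Table 2 that fall in the given basin."""
    return 100 * WELLS_BY_BASIN[basin] / total


print("Sum of basin counts matches reported total:", total == REPORTED_TOTAL)
print(f"Permian share: {share_percent('Permian'):.1f} percent")
print(f"NA share: {share_percent('NA'):.2f} percent")
```

Note that the sum equals the reported total only when the 850 "NA" wells are included, so the share computed for "NA" is relative to a total that already contains those wells.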


4.0 QA/QC
All of the analyses presented in this memorandum adhered to the Environmental Engineering Support for Clean Water Regulations Programmatic Quality Assurance Project Plan (PQAPP) (Eastern Research Group, 2014), including its procedures for collecting, evaluating, and analyzing existing data and information.
      ERG performed the following QC checks in conducting the analyses:
      
 ERG developed the database and analyses in a Microsoft Access database. A team member knowledgeable of the project, but who did not develop the database or perform the analyses, reviewed the database, queries, and outputs to ensure the accuracy of data extracted into the database, the technical soundness of methods and approaches used, the logic of all of the queries, and the accuracy of the calculations, and documented their review.
 A team member who did not perform the analysis reviewed the data correction of the basin names by checking 20 percent of all updated records.
 ERG had a senior staff member review all analyses and results.
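
The 20 percent check of updated basin records could be supported by a reproducible selection of records for the reviewer. The memo does not state how the records were chosen, so the Python sketch below assumes a seeded random draw; the function name, the seed, and the ENTITY_ID values are hypothetical.

```python
import random


def qc_sample(record_ids, percent=20, seed=2016):
    """Pick a reproducible random subset covering at least `percent` of the records."""
    ids = sorted(record_ids)
    sample_size = (len(ids) * percent + 99) // 100   # integer ceiling of percent/100 * n
    return sorted(random.Random(seed).sample(ids, sample_size))


# Hypothetical ENTITY_ID values for 250 updated records (illustrative count only).
updated_ids = list(range(1001, 1251))
review_list = qc_sample(updated_ids)
print(len(review_list), "of", len(updated_ids), "updated records selected for review")
```

Fixing the seed lets a second reviewer regenerate the same list, which makes the documented review easier to audit.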
   

5.0 References
       Drillinginfo, Inc. 2015. DI Desktop(R) March 2015 Download_CBI. DCN CWT00145.
       Eastern Research Group, Inc. 2014. Environmental Engineering Support for Clean Water Regulations Programmatic Quality Assurance Project Plan (PQAPP). DCN CWT00295.



