                          eNOI Data Processing Steps

Data from EPA's eNOI system were downloaded in August 2012 to analyze MSGP DMR records. The data from separate eNOI tables were combined into an Access database for processing (see MS Access file: eReporting_8-27-2012.mdb). Most of the processing and analysis was performed in a separate MS Access file (see MS Access file: eReporting Analysis, r00.accdb), with some supporting steps performed in MS Excel (see QBM Analysis, rev 01.xlsx). We followed the steps below in processing the data. A more detailed description is provided in the MS Word file, $Documentation.docx.
Initial Data Processing Steps  
   1) Data from the following eNOI tables were queried for this analysis: PT_STATUS, RPT_FACILITY, RPT_MONITORING, ERPT_DISCHARGE_LOCATION_ID, and RPT_DISCHARGE. We started with 42,351 monitoring records (485 facilities).
   2) The following initial steps were performed to remove unneeded data records, leaving 23,835 records (385 facilities):
      i) Deleted records where the parameter name was not provided.
      ii) Deleted facilities that were not "Active".
      iii) Deleted monitoring data not associated with the benchmark monitoring program and/or effluent guideline monitoring program. 
      iv) Deleted monitoring results identified as "not applicable".
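
The four screening rules above can be sketched as a single filter. This is only an illustration in Python, not the Access queries actually used; the field names (permit, parameter, status, program, result) and the program labels are assumptions, since the real eNOI column names and codes are not listed in this document.

```python
# Hypothetical record layout; real eNOI column names and codes may differ.
records = [
    {"permit": "P1", "parameter": "Zinc", "status": "Active",   "program": "Benchmark", "result": "0.12"},
    {"permit": "P2", "parameter": "",     "status": "Active",   "program": "Benchmark", "result": "0.30"},
    {"permit": "P3", "parameter": "Lead", "status": "Inactive", "program": "Benchmark", "result": "0.05"},
    {"permit": "P4", "parameter": "Iron", "status": "Active",   "program": "Other",     "result": "1.10"},
    {"permit": "P5", "parameter": "TSS",  "status": "Active",   "program": "ELG",       "result": "not applicable"},
]

KEPT_PROGRAMS = {"Benchmark", "ELG"}  # assumed labels for the two retained programs


def keep_record(rec):
    """Return True if the record survives all four screening rules."""
    if not (rec.get("parameter") or "").strip():
        return False  # i) parameter name not provided
    if rec.get("status") != "Active":
        return False  # ii) facility is not "Active"
    if rec.get("program") not in KEPT_PROGRAMS:
        return False  # iii) not benchmark or effluent guideline monitoring
    if (rec.get("result") or "").strip().lower() == "not applicable":
        return False  # iv) result identified as "not applicable"
    return True


screened = [rec for rec in records if keep_record(rec)]
print(len(records), "->", len(screened), "records")
```

In this toy input only the first record passes, because each of the other four violates exactly one rule.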
Benchmark Monitoring Data Processing Steps  
   1) It was determined that the eNOI program had allowed users to enter the parameter name as free text. Using the data from the initial processing steps, the resulting inconsistencies were resolved by mapping the reported names to simplified parameter names. For example, parameters reported as "Lead", "Total Lead", and "Total Recoverable Lead" were treated as the same parameter. Numerous misspellings were also corrected.
   2) Processing steps for metals with hardness-specific benchmarks
         a. Selected records from step 1 associated with the benchmark monitoring program for the six metals with hardness-specific benchmarks (n=6,155).
         b. Records with missing hardness data were eliminated (n=4,973).
         c. Records with non-standard quarterly reporting periods were eliminated (n=4,479).
         d. Records with missing or zero concentrations reported as "detected" or as "detected but below QL" were eliminated (n=4,371, with 3,539 detected observations; 148 facilities).
         e. As per the permit requirements for computing yearly averages, the concentrations for "non-detects" were recoded to zero and the concentrations for "detected but below QL" were recoded to one-half the quantitation limit.
   3) Processing steps for other parameters
         a. Selected records from step 1 associated with parameters that have constant benchmarks (n=14,734).
         b. Records with non-standard quarterly reporting periods were eliminated (n=13,591).
         c. Records with missing or zero concentrations reported as "detected" or as "detected but below QL" were eliminated (n=12,288, with 9,943 detected observations; 355 facilities).
         d. As per the permit requirements for computing yearly averages, the concentrations for "non-detects" were recoded to zero and the concentrations for "detected but below QL" were recoded to one-half the quantitation limit.
   4) The resulting data were sorted by permit, outfall, parameter, and reporting sequence (i.e., ERPT_MONITOR_EVENT_ID). The reporting sequence was used to sort reporting data periods into an approximate chronological order. Using this order, sequential observations from the same reporting period were averaged to a single value for a given reporting period.
   5) Yearly averages were then computed whenever four consecutive records included data from each of the standard quarterly reporting periods. (Each record could be used in only one yearly average, i.e., no sliding yearly window.) Based on the available data, a facility could have from one to three yearly averages for a given outfall/parameter combination.
         a. There were 1,334 yearly averages meeting the above requirements.
   6) The resulting yearly averages were combined with the benchmark monitoring requirements at the subsector level, compared to their corresponding benchmark values, and results summarized to the sector level. 
         a. For example, PRR05BH95 is associated with subsectors C.1, C.2, C.3, and C.4. The requirements for these subsectors are listed below. The analysis for PRR05BH95 would then include the results for iron, lead, nitrate + nitrite, phosphorus, zinc, and aluminum in Sector C without duplication (i.e., even though zinc is listed under three separate subsectors, the zinc results for PRR05BH95 are analyzed only once).
            Subsector   Simplified Parameter Names
            C.1         Iron, Lead, Nitrate + Nitrite, Phosphorus, Zinc
            C.2         Aluminum, Iron, Nitrate + Nitrite
            C.3         Nitrate + Nitrite, Zinc
            C.4         Zinc
         b. In another example, MAR05CW65 is associated with subsectors L.1 and O.1, which have monitoring requirements of TSS and iron, respectively. In this case, the TSS results for MAR05CW65 are included only under Sector L and the iron results are included only under Sector O.
         c. An overall table is provided that analyzes all the data regardless of the facility's monitoring requirements.
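
As an illustration of steps 2e/3d, 5, and 6 above, the following Python sketch recodes qualified results, forms non-overlapping yearly averages, and builds the de-duplicated parameter list per sector. It is not the Access/Excel implementation used for the analysis. The quarter labels (Q1 to Q4), the function names, and the choice to advance by one record when a four-record window is incomplete are assumptions. The subsector parameter lists are the ones given in examples 6a and 6b.

```python
STANDARD_QUARTERS = {"Q1", "Q2", "Q3", "Q4"}  # assumed quarter labels

# Subsector requirements taken from examples 6a and 6b above.
SUBSECTOR_PARAMETERS = {
    "C.1": {"Iron", "Lead", "Nitrate + Nitrite", "Phosphorus", "Zinc"},
    "C.2": {"Aluminum", "Iron", "Nitrate + Nitrite"},
    "C.3": {"Nitrate + Nitrite", "Zinc"},
    "C.4": {"Zinc"},
    "L.1": {"TSS"},
    "O.1": {"Iron"},
}


def recode(concentration, qualifier, quantitation_limit):
    """Steps 2e/3d: non-detects become 0, below-QL results become QL/2."""
    if qualifier == "non-detect":
        return 0.0
    if qualifier == "detected but below QL":
        return quantitation_limit / 2.0
    return concentration


def yearly_averages(series):
    """Step 5: average four consecutive records covering all four quarters.

    series is a list of (quarter, value) pairs in reporting order for one
    permit/outfall/parameter, with one value per reporting period.
    """
    averages = []
    i = 0
    while i + 4 <= len(series):
        window = series[i:i + 4]
        if {quarter for quarter, _ in window} == STANDARD_QUARTERS:
            averages.append(sum(value for _, value in window) / 4.0)
            i += 4  # no sliding window: each record is used at most once
        else:
            i += 1  # assumption: retry starting from the next record
    return averages


def sector_parameters(subsectors):
    """Step 6: union of required parameters per sector, without duplicates."""
    by_sector = {}
    for subsector in subsectors:
        sector = subsector.split(".")[0]
        by_sector.setdefault(sector, set()).update(SUBSECTOR_PARAMETERS[subsector])
    return by_sector


example = [("Q2", 1.0), ("Q3", 2.0), ("Q4", 3.0), ("Q1", 4.0), ("Q2", 5.0)]
print(yearly_averages(example))
print(sector_parameters(["C.1", "C.2", "C.3", "C.4"]))
print(sector_parameters(["L.1", "O.1"]))
```

For the PRR05BH95 example, the union for Sector C contains six parameters, with zinc appearing once even though three subsectors require it.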

ELG Data Processing
   1) It was determined that the eNOI program had allowed users to enter the parameter name as free text. Using the data from the initial processing steps, the resulting inconsistencies were resolved by mapping the reported names to simplified parameter names. For example, parameters reported as "Arsenic", "Total Arsenic", and "Total Recoverable Arsenic" were treated as the same parameter. Numerous misspellings were also corrected.
   2) Selected measurements associated with the ELG monitoring program (n=3,299). (Unlike the benchmark monitoring program, we were only interested in whether reported results for detected observations exceeded the ELG concentrations. Thus, no screening for sampling quarter, hardness, or presence of concentrations was applied.)
   3) ELG concentrations were extracted from the MSGP permit by sector. For some combinations of sector and parameter, there were multiple ELG concentrations, typically for the monthly average values and daily maximums. We chose to reduce this to the daily maximum for consistency with the collected measurements. 
   4) Some facilities belong to more than one sector. As a result, when the 3,299 records from step 2 are joined to the Sector-Facility lookup table, the number of records for analysis increases to 5,327.
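
The record expansion in step 4 and the daily-maximum comparison in step 3 can be sketched as follows. This is a hedged Python illustration only. The field names, the facility-to-sector lookup, and the limit value are hypothetical placeholders, not values taken from the MSGP permit or the eNOI data.

```python
# Hypothetical measurements; field names are assumptions.
records = [
    {"permit": "A", "parameter": "TSS", "value": 120.0, "detected": True},
    {"permit": "B", "parameter": "TSS", "value": 30.0, "detected": True},
]

# Hypothetical Sector-Facility lookup: facility A belongs to two sectors.
facility_sectors = {"A": ["E", "J"], "B": ["J"]}

# Placeholder daily maximum limit, NOT an actual ELG value from the permit.
DAILY_MAX = {("E", "TSS"): 50.0}


def join_to_sectors(rows, lookup):
    """Emit one output row per (record, sector) pair, as in step 4."""
    joined_rows = []
    for row in rows:
        for sector in lookup.get(row["permit"], []):
            joined_rows.append({**row, "sector": sector})
    return joined_rows


def exceeds_elg(row, limits):
    """True if a detected result is above the daily maximum for its sector."""
    limit = limits.get((row["sector"], row["parameter"]))
    return limit is not None and row["detected"] and row["value"] > limit


joined = join_to_sectors(records, facility_sectors)
print(len(records), "records ->", len(joined), "joined rows")
print([exceeds_elg(row, DAILY_MAX) for row in joined])
```

A facility listed under two sectors contributes each of its records twice, which is why the joined count exceeds the original record count.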
