Along with the Hortonworks Oil and Gas team, I have been working closely with Laurence Sones, senior petrophysicist, to understand how Hadoop-based data discovery is enabling geological and geophysical (G&G) teams to improve decision-making across their assets. What follows is a Q&A session with Laurence discussing his perspectives on data discovery.
Kohlleffel: Laurence, you have a wealth of experience in the oil and gas industry. Please discuss your background and some of the roles that you have taken on.
Sones: Sure, I began as a field engineer with Schlumberger in the logging and perforating area. Following that, I moved into wireline sales and then did a stint as a Service Quality Manager for both open hole wireline and cased hole wireline. In addition, I was a well placement engineer for Schlumberger and then moved to Anadarko performing geosteering. Lastly, I was with Forest Oil as a petrophysicist.
Kohlleffel: Can you discuss both the geological analysis (surface, subsurface, and core drilling) and geophysical analysis (seismic, gravity, magnetic, electrical, geochemical) processes? How does working with a broad set of data allow you to make a decision or recommendation regarding high potential areas?
Sones: My foundation for effective log analysis comes from the time I spent in the field, where I learned to distinguish high-quality from poor-quality log data based on how it was collected, and to understand all of the parameters in play at the time of acquisition. My years in the industry have also given me a clear view of the people who use the data and the applications they rely on.
Initially, we review production for an area, and reservoir engineers and geologists develop and review production type curves. Next, geologists and petrophysicists review well logs and establish a basic petrophysical model based on rock type, fluid type, and so on, looking for a good correlation between the properties recorded on logs and the actual physical properties of the rocks. With that model, water saturation, effective porosity, and net pay can be calculated, and the acreage can be graded on those properties. Reservoir modeling may then be done with volumetrics that incorporate the petrophysical properties, with production data brought in to correlate production back to those properties.
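The calculations Laurence mentions follow standard petrophysical relationships. As a hedged illustration (not part of the interview), the sketch below computes porosity from a bulk density log, water saturation via the classic Archie equation, and a simple net-pay flag; all parameter values (matrix density, Archie constants, cutoffs) are illustrative assumptions that would be calibrated per field.

```python
# Illustrative sketch of the basic petrophysical calculations described
# above: density porosity, Archie water saturation, and a net-pay flag.
# Parameter defaults (sandstone matrix, a=1, m=2, n=2, cutoffs) are
# hypothetical and would be tuned to the actual reservoir.

def density_porosity(rho_bulk, rho_matrix=2.65, rho_fluid=1.0):
    """Porosity from the bulk density log (sandstone matrix assumed)."""
    return (rho_matrix - rho_bulk) / (rho_matrix - rho_fluid)

def archie_sw(phi, rt, rw=0.05, a=1.0, m=2.0, n=2.0):
    """Archie water saturation: Sw = ((a*Rw) / (phi^m * Rt))^(1/n)."""
    return ((a * rw) / (phi ** m * rt)) ** (1.0 / n)

def net_pay_flag(phi, sw, phi_cut=0.08, sw_cut=0.5):
    """Flag an interval as pay if porosity and saturation pass cutoffs."""
    return phi >= phi_cut and sw <= sw_cut

# Example interval: bulk density 2.35 g/cc, deep resistivity 20 ohm-m
phi = density_porosity(2.35)   # ~0.18
sw = archie_sw(phi, rt=20.0)   # fraction of pore space holding water
print(round(phi, 3), round(sw, 3), net_pay_flag(phi, sw))
```

Grading acreage then amounts to running these per-interval values across every well and ranking by net pay.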
Looking at the geological structure is also part of standard analysis of a play – mapping formation tops and fluid contacts, if those are present – and we also review seismic data when available.
Lastly, we do core analysis, which incorporates multiple datasets, including rock composition, water/oil saturation, porosity, permeability, SEM imaging, geological description, and mechanical properties, all of which can feed an advanced petrophysical model. Typically, we can develop a strong correlation between the recorded log properties and the actual properties of the rock.
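That calibration step, checking how well log-derived properties track the properties measured on core, can be sketched as a simple correlation test. The sample values below are made up for illustration; they are not from the interview.

```python
# Hedged sketch of the core-to-log calibration described above: measure
# how closely log-derived porosity tracks core-plug porosity over the
# cored interval. Sample data is hypothetical.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired measurements over a cored interval
log_phi  = [0.06, 0.09, 0.12, 0.15, 0.18]   # porosity from density log
core_phi = [0.05, 0.10, 0.11, 0.16, 0.17]   # porosity from core plugs

r = pearson_r(log_phi, core_phi)
print(round(r, 3))   # close to 1.0 indicates the log is well calibrated
```

A high coefficient supports using the log-derived properties across uncored wells; a poor one sends the petrophysicist back to the model.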
Kohlleffel: What are some of the manual processes involved in working with all of these disparate datasets?
Sones: Looking at the most recent field I worked, the process was manually intensive: 90% of my time went to QC of logs, data quality review, and properly identifying the curves in each file to ensure the right information was being used in the analysis. Only the final 10% of my time on a project was left for the fun part, the analysis.
All of these manual tasks take a significant amount of time with a field of any size. Commonly, a geophysicist might start by verifying the depth reference for every single well against a public site, to ensure that the highest quality data goes in and a high quality result comes out at the end.
Kohlleffel: Operators are under tremendous pressure to reduce costs and this is putting enormous organizational and financial pressure on existing models. Can you comment on the feedback we are getting from operators that the use of Hadoop as an economical platform for advanced analytics is key to their ability to deliver an optimized cost model?
Sones: Producers are looking for any way to reduce costs, and I see multiple ways to do this with advanced analytics driven by Hadoop, whether it’s effectively checking your depth reference or running a powerful sensitivity analysis to drive costs down and understand where you should be drilling. I am seeing an increased number of techs hired in some places to manage databases and data sources, but that is not a replacement for optimization and what advanced analytics with Hadoop can bring to the table. It’s really just throwing more manpower at the problem rather than applying better technology that could benefit techs, geologists, engineers, and petrophysicists alike.
Kohlleffel: Laurence, please expound on the datasets that are critical to a G&G organization – can you go through those in more detail and describe the challenge in getting a single view or comprehensive map that includes the relevant data?
Sones: I’m glad to. The primary dataset is log data, which in many cases is recorded multiple times for different measurements on the same well. Log data establishes the foundation of analysis for geologists and petrophysicists. Production data is also critical, and it can reside in multiple sources; generally it’s pulled into a primary geological analysis application.
In addition to that, seismic data almost always sits on a separate platform used by the geophysicists. I’ve mentioned some of the in-house data, but you also have all of the legal information in the public domain on state commission websites – legal location, legal well name, API numbers, depth records, elevation, etc.
Kohlleffel: How do you feel that Hadoop is helping companies across the industry address this proliferation of data silos as well as the manual QC process?
Sones: Hadoop is well suited for the data discovery required by the G&G community because as a centralized data platform it allows us to ingest all information for a single view. This includes structured data such as production and completion records, semi-structured data such as well logs, and unstructured data such as spreadsheets and PDFs. From this wide variety of datasets, we create our own “path” to the data and combinations of datasets that we feel are most important for our analysis. That’s important, because not being constrained to a prescribed path allows for complete freedom of data discovery by an individual, and allows us to ask questions that we hadn’t considered before. We may want to perform focused analysis on a small subset of wells or perform analytical processing on an entire basin or reservoir. Hadoop makes either scenario possible.
Furthermore, we can use the end user visualization tools that we are already familiar with to do sensitivity analysis in order to get the clearest picture of what is driving production. Some of the new areas that I am exploring with Hortonworks include leveraging Hadoop to bucket curves into analysis classes, auto zoning, metadata correction, machine learning for enhanced sensitivity analysis and data exploration, and batch processing of LAS files for various conversion metrics.
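One of those ideas, bucketing curves into analysis classes during batch processing of LAS files, can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration: it scans the ~Curve section of a LAS 2.0 well-log file and maps curve mnemonics to broad classes. The mnemonic-to-class table is an assumed starting point; real mnemonics vary widely by vendor and vintage, which is exactly why this QC step is valuable at scale.

```python
# Minimal illustrative sketch: bucket LAS 2.0 curve mnemonics into broad
# analysis classes. The CURVE_CLASSES map is a hypothetical starting
# point, not an industry-standard dictionary.

CURVE_CLASSES = {
    "GR": "gamma_ray", "SGR": "gamma_ray",
    "RHOB": "density", "DRHO": "density",
    "NPHI": "neutron_porosity", "DPHI": "density_porosity",
    "ILD": "resistivity", "LLD": "resistivity", "RT": "resistivity",
    "DEPT": "depth", "DEPTH": "depth",
}

def bucket_curves(las_text):
    """Return {mnemonic: class} for curves in the ~Curve section."""
    buckets, in_curves = {}, False
    for line in las_text.splitlines():
        line = line.strip()
        if line.startswith("~"):
            # LAS section headers are identified by their first letter
            in_curves = line.upper().startswith("~C")
            continue
        if not in_curves or not line or line.startswith("#"):
            continue
        # Curve definition lines look like "GR .GAPI : Gamma Ray"
        mnemonic = line.split(".", 1)[0].strip().upper()
        buckets[mnemonic] = CURVE_CLASSES.get(mnemonic, "unclassified")
    return buckets

sample = """~Version
VERS .   2.0 : CWLS Log ASCII Standard
~Curve Information
DEPT .FT     : Depth
GR   .GAPI   : Gamma Ray
RHOB .G/C3   : Bulk Density
XYZ  .       : Mystery curve
~ASCII
"""
print(bucket_curves(sample))
```

Run across thousands of LAS files on Hadoop, a pass like this surfaces the unclassified mnemonics that would otherwise each require a manual look, which is the 90% QC burden Laurence described earlier.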
Kohlleffel: How important is it to you to have a 100% open source approach versus a partial open approach?
Sones: It’s an interesting question. I’ve repeatedly seen how critical it is to be able to take up innovative features of software quickly, and I believe that Hortonworks’ 100% open source approach to Hadoop gives the oil and gas industry a distinct advantage over other approaches.
Kohlleffel: Laurence, I want to thank you for your time today and we look forward to future discussions.