PGE HACKATHON 2022

Team: NFL - No free Lunch

Executive Summary

For this challenge, we use machine learning to predict the locations of 3 infill wells and forecast their production for the next 2 years. To achieve this, we performed exploratory data analysis of the given datasets. This step included data preparation, feature imputation and feature ranking. Once we narrowed down the predictor features for our model, we used a 70/30 train-test split. We decided to use decision trees because they capture nonlinear relationships between the predictors and the response, are easy to interpret, and do not need feature standardization or normalization.

WORKFLOW DESCRIPTION

Disclaimer: Most of this workflow was prepared using code snippets from workflows designed by Dr. Michael Pyrcz and Dr. John Foster of The University of Texas at Austin. Thank you for all the support.

Import Required Packages

Declare functions

Let's define a couple of functions to streamline plotting correlation matrices and visualizing a machine learning regression model's response over 2 predictor features. Functions written by Dr. Michael Pyrcz.

Loading Tabular Data

Here's the command to load our comma-delimited data file into a pandas DataFrame object.
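As a minimal sketch of this loading step, the example below reads comma-delimited text with `pd.read_csv`. The data and the column names (`well_id`, `porosity`, `permeability`) are hypothetical stand-ins, not the hackathon dataset; in the notebook the call would receive the path to the actual CSV file instead of an in-memory buffer.

```python
import io

import pandas as pd

# Hypothetical stand-in for the hackathon CSV file; the column names and
# values are illustrative only.
csv_text = """well_id,porosity,permeability
W1,0.12,45.0
W2,0.15,80.5
W3,0.10,12.3
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Preview the first rows to confirm the file was parsed as expected.
print(df.head())
```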

Summary statistics of the dataset

Copy df and TRUNCATE Negative values
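One possible way to do this is sketched below, assuming that "truncate" means setting negative values to zero in the numeric columns. The DataFrame contents and column names are hypothetical. Working on a copy keeps the original `df` unchanged for later comparison.

```python
import pandas as pd

# Hypothetical data with physically impossible negative values.
df = pd.DataFrame(
    {
        "well_id": ["W1", "W2", "W3"],
        "porosity": [0.12, -0.01, 0.10],
        "permeability": [45.0, 80.5, -3.2],
    }
)

# DataFrame.copy() makes a deep copy by default, so df is left untouched.
df_clean = df.copy()

# Assumption: truncation means clipping negatives to 0.0 in numeric columns.
num_cols = df_clean.select_dtypes(include="number").columns
df_clean[num_cols] = df_clean[num_cols].clip(lower=0.0)

print(df_clean)
```

If the negative values are treated as missing rather than as zeros, they could instead be replaced with NaN and handled in the feature imputation step.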

FEATURE IMPUTATION

Summary Statistics

Summary statistics of one variable at a time. The describe command provides count, mean, standard deviation, minimum, maximum, and quartiles all in a compact data table. We use the transpose() command to flip the table so that features are on the rows and the statistics are on the columns.
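A short illustration of the describe and transpose combination, using a small hypothetical DataFrame in place of the hackathon data:

```python
import pandas as pd

# Hypothetical features; values are illustrative only.
df = pd.DataFrame(
    {
        "porosity": [0.10, 0.12, 0.14, 0.16],
        "permeability": [10.0, 20.0, 30.0, 40.0],
    }
)

# describe() puts statistics on the rows; transpose() flips the table so
# each feature is a row and each statistic is a column.
summary = df.describe().transpose()

print(summary)
```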

MULTILINEAR REGRESSION

MultiLinear Regression for LOWER completion zone

$\kappa \sim \phi^3/(1-\phi)^2$
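The porosity term in this proportionality can be evaluated directly, for example to build a transformed porosity feature. The sketch below is only an illustration: the function name `porosity_term` is hypothetical, and the proportionality constant that would link the term to permeability is not included.

```python
import numpy as np


def porosity_term(phi):
    """Return phi^3 / (1 - phi)^2 for porosity given as a fraction (0 <= phi < 1)."""
    phi = np.asarray(phi, dtype=float)
    return phi**3 / (1.0 - phi) ** 2


# Hypothetical porosity values, as fractions.
phi = np.array([0.05, 0.10, 0.20])
term = porosity_term(phi)

# The term grows quickly with porosity, so small porosity changes imply
# large relative changes in the predicted permeability trend.
print(term)
```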

UPSCALING: from Well-log scale to Well scale
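As a rough sketch of what moving from well-log scale to well scale can look like, the example below collapses several log samples per well into one value per well. It assumes a simple arithmetic average, which may not match the averaging used in this workflow; the column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical well-log samples: several depth samples per well.
logs = pd.DataFrame(
    {
        "well_id": ["W1", "W1", "W2", "W2"],
        "depth": [3000.0, 3000.5, 3100.0, 3100.5],
        "porosity": [0.10, 0.14, 0.20, 0.22],
    }
)

# Assumption: upscale by taking the arithmetic mean of the samples in each well.
well_scale = logs.groupby("well_id", as_index=False)["porosity"].mean()

print(well_scale)
```

Arithmetic averaging is a common choice for porosity. Other properties, such as permeability, are often upscaled with geometric or harmonic averages instead.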

ACOUSTIC IMPEDANCE

FEATURE RANKING

VISUALIZATION

MACHINE LEARNING MODEL - DECISION TREE - Oil Rate
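A minimal sketch of a 70/30 train-test split followed by a decision tree regression, using scikit-learn. The data here is synthetic, and the hyperparameters (`max_depth=3`, `random_state=0`) are illustrative assumptions, not the settings tuned in this workflow.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in: 2 predictor features and an oil-rate-like response.
rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 2))
y = 100.0 * X[:, 0] + 50.0 * X[:, 1] ** 2 + rng.normal(scale=1.0, size=100)

# 70/30 train-test split; test_size=0.3 withholds 30% of the samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# No standardization or normalization is applied: tree splits depend only
# on the ordering of values within each feature.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

# Predict on the withheld samples and report the R^2 score.
y_pred = tree.predict(X_test)
print("Test R^2:", tree.score(X_test, y_test))
```

In practice the tree depth or number of leaf nodes would be tuned against the withheld testing data to balance underfitting and overfitting.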

MODEL PREDICTION FOR NEW WELLS - OIL RATE (bpd)