My Journey to Data Science – Part 2 of 3

My Career Journey Over the Last 20+ Years

October 26, 2016


Jamey Johnston (@STATCowboy)



This is Part 2 of a 3-part blog series on my career journey over the last 20+ years into my current role as a Data Scientist. Part 1 can be found here!



I had been working for LDEQ for three years now and was one year removed from college when a good friend called. He was working as a consultant for Oracle Corporation in Houston. Oracle had signed a big contract with a client in Houston and needed DBAs on the project. He told me to send my resume so I could join him which I did! Two weeks later I was a Senior Consultant with Oracle Corporation working and living in Houston. It was a very exciting time of my life those three years. I had one of the best DBA mentors you could have, Clement C., teaching me about Oracle and technology and was getting to travel to some cool places in the US. Most importantly, though, I found the love of my life soon after moving to Houston, my wife, Melanie!

Oracle was a great 3 years for me and I learned so much that truly set the stage for the next 10 years or so of my life as a DBA. I enjoyed Oracle but after 3 years of travelling it was time to stop traveling and get a local job in Houston so I could spend more time with my wife and dogs! My wife will tell you I only stopped traveling for the dogs (not true though)!

This little-known company at the time called Enron (Ha Ha!) was hiring and I got hired on there! It was June of 2001 and Enron couldn’t be flying any higher (at least in their make-believe world!). We were one of the 10 largest companies (at least on invisible paper) in the world and the talent coming to work there was second to none. Over the next year and a half I would learn a great deal about business as well as management as I was soon promoted to manage one of the core DBA teams at Enron. Enron went bankrupt and I moved to work for UBS when they bought the North American trading division of Enron.

After a brief stint at UBS they decided to get out of Houston and I went to work for a small consulting firm, Methods Technology Solutions, out of Baton Rouge. I did some travelling again and really had a lot of freedom to learn some new things and work on a variety of database projects from very small projects to bigger ones. I worked for Methods for 4 years, the last year and a half consulting at a large independent Oil and Gas company. The Oil and Gas company offered me a full-time job and I have been there now for over 12 years.

The first 7 years at the O&G company I was a DBA team member working on admin, upgrades and a lot of tuning. It was fun! I had some time to write a little code, too! I created an end-user web dashboard in C# ASP.NET for the app owners to see what was happening in the DB with their systems. I created a multi-threaded application in C# to rapidly put together archived emails into a bundle and create audit trails in XML, CSV, TXT and DB tables. However, I mostly did the typical enterprise DBA tasks.


Tuning SQL to Enhancing Business Processes and Relationships


I really enjoyed tuning database systems. I was a master of 10046 traces in Oracle and SQL Profiler and Extended Events in SQL Server and was able to tune a great deal of our applications. The funny thing about tuning databases when you are a DBA is you tend to look for that magic parameter, trace flag, bad SQL, etc. and often forget about the application and business processes behind what you are trying to optimize. More importantly, you often forget about the business people using those applications and how that data is used to add business value.

Anyone who knows me knows I am not shy! I love people and I love engaging with people. This led me to have a different approach to tuning. When I would get a call that something was slow I would find the business person who was having an issue, schedule an hour of their time in their office, and have them reproduce the issue in the system while I watched both what they were doing in the application and what was happening in the database. It was funny to see their face when the “DBA” would show up in their office. It was like an endangered species had left its native habitat and ended up in their office. Many would even tell me, “I have never seen a DBA before!” This approach created a special relationship that allowed me to start understanding the business processes behind my company and fostering relationships to do more than just “tune database systems”.

I was put on a special project to implement a new business system and was actually placed for a year within the business team to assist with implementing the application. The application is a critical one for O&G companies and so I was dedicated to just working on that project and application. For the first time in almost 15 years I wasn’t a DBA! I wasn’t on-call (I didn’t know what to do with my new-found freedom). I didn’t have to do database upgrades. It was so weird to actually be a “customer” of the DBA team! It certainly gave me insight into how it felt to be on the other side of the table. I absolutely loved the project. I spent so much time understanding all the various data pieces and how they would flow into the new system (we had several different DBs that would need to integrate into this new system) and more importantly how that data truly worked for this particular part of the business. I really loved data! I realized then that I didn’t really want to be a DBA full-time (still like the part-time idea just to keep current but not very practical) but I wanted to work with data. I wanted to build datasets and analyze them. I wanted to find ways to better connect them together. I wasn’t quite on the statistics track yet but definitely the data and business side of Data Science.




As I began to understand the applications and business processes behind using the databases I supported, I also became aware of how so many of the systems were connected to one another either by ETL, database links, linked servers or virtually through various business processes. The connections of all the systems and more importantly, for me, the data led me on my next journey into BI.

I had become exposed to a tool called Spotfire while I was on the special project and learned how to use the tool and how people at work were using the tool. So, I began to work on connecting Spotfire directly and securely to the data sources and soon I was moved onto the BI team working on BI projects in Microsoft and Spotfire. I was a BI guy now!


Stay tuned for Part 3 tomorrow – BI to Data Scientist!


My Journey to Data Science – Part 1 of 3

My Career Journey Over the Last 20+ Years

October 25, 2016


Jamey Johnston (@STATCowboy)


So I am at the EMP in Seattle attending the SQL Summit 2015 Appreciation Event and I am introduced to Denise McInerny (t) by my good friend Wendy Pastrick (t | b). Of course, the common pleasantries are exchanged, “My name is …”, “I work at XYZ company …” and “I do ABC job …”, which in my case is “My name is Jamey Johnston, I work at an O&G company and I am a Data Scientist”. Denise’s response was not quite what most people’s response is when I tell them I am a Data Scientist. Usually I get a general look of trepidation or the occasional, “Oh, you are a unicorn!” (true story, several times), but in Denise’s case she says “You should write a blog about your journey to become a Data Scientist” (or something along those lines). I thought that might be a fun blog to write and said “Sure!”. So here is the story of my journey to becoming a Data Scientist.


10 PRINT “My Name is Jamey”


So, I am an eight-year-old living in the suburbs of New Orleans and I want a computer, a Commodore Vic-20, and I get it! I plug it in and connect it to the color console TV in my living room and turn it on. Within 10 minutes I have written my first program:


10 PRINT "My Name is Jamey"

20 GOTO 10

30 RUN

The next thing I see is the greatest thing I have ever seen – “My Name is Jamey” scrolling across my parents’ TV screen as fast as I can read it! I sit there and watch it go across the screen over and over and over again, finally asking myself – “How do I stop it!?”. I scan through the manual that came with the Vic-20 looking for the correct key combination to stop it! No luck, and no Internet in 1982, so I do the only thing I can think of to stop it – “Unplug!”. I loved that computer and it was what fueled my desire to learn more about computers!


GIS and College


Fast forward to college and I am a 19-year-old who just finished his first year of college as a Psych major and I get a job with the Civil Engineering department at school working with GIS and Remote Sensing technologies. My father had started in GIS and Remote Sensing back in the 70’s for Wetlands research working for the Department of Interior so I was familiar with GIS and Remote Sensing and I was excited to work with big workstations and lots of data, and to get to work in the same field as my dad! These big workstations were SGI and DG boxes running processors from 25 MHz to 100 MHz! Your phone is probably 100 times faster!

Two years later I finish my junior year and I am still working on GIS and Remote Sensing projects, one of which was to run clusters against tiles of Thematic Mapper satellite data of the entire state of Louisiana and then use those clusters to classify them into land use and land cover categories (i.e. is this cluster water or agriculture or coniferous forest, etc.). I was working with ESRI and Erdas Imagine software and learning UNIX which was really fun and beneficial to my career.

I loved being a GIS and Remote Sensing technologist but I was still a Psych major! I wanted to change majors but the problem was my University only had ONE geography class! So there was no way I could stay at the University of Southwestern Louisiana and get a GIS degree. I transferred to LSU, which had a great GIS program through the Department of Geography and Anthropology, starting in the Fall of 1995 to get a BS in Geography with an emphasis in GIS and Remote Sensing. I also went to work for the GIS Center at the Louisiana Department of Environmental Quality (LDEQ) continuing my career in GIS and Remote Sensing. It took me two more years to finish school as I had to take 36 hours of Geography to graduate. My first semester was 12 hours of Geography, ranging from a Geography 1001 class up to a graduate-level Geography 4998 class, plus a music class I believe. It was a fun semester to say the least.

I graduated in Spring of 1997 from LSU and continued working for the GIS Center in the field of GIS and Remote Sensing. I also started working with another cool technology, Relational Database Management System (RDBMS). ESRI had created the first version of ArcSDE which was a way to store your spatial data in a database. In my case it was Oracle. We had Oracle v6 and v7 databases at LDEQ and I was starting to learn them to support ArcSDE along with some other technologies that required RDBMS.


Stay tuned for Part 2 tomorrow – DBA to BI!

Introduction to R and Data Science Tools in the Microsoft Stack

So you want to code like a Pirate! – R!


Jamey Johnston (@STATCowboy)


Thanks to all who attended my session at SQL Saturday Albuquerque (#SQLSatAlbuquerque). Below is the link to the scripts, documents and presentation materials from the session. If you have any questions or comments, feel free to hit me up on Twitter (@STATCowboy).


Download from OneDrive!


SQL Server 2016 Security Demo

RLS, Dynamic Data Masking and Always Encrypted

Jamey Johnston (@STATCowboy)

NOTE: Demo requires SQL Server 2016 CTP 2.4+ or Azure SQL Database v12.


Overview of the Demo


Security is a vital part of a DBA, Developer and Data Architect’s job these days. The number of attacks on the databases of major retailers by criminals looking for information such as email addresses, ID numbers and birthdays to exploit and sell is ever increasing. The demo contained in the link at the bottom of this page is meant to showcase three of the new features Microsoft is bringing to Azure SQL Database v12 and SQL Server 2016 to help combat these cyber-attacks:

Row Level Security (RLS)

Dynamic Data Masking

Always Encrypted


This first part of the demo is meant to show how an organizational hierarchy and asset hierarchy can be leveraged together to provide Row Level Security on tables in a SQL Server database using the new predicate-based RLS feature in SQL Server 2016 and Azure SQL Database v12. The demo uses completely made-up oil well production data for a fictitious organization with 153 fictional employees and comes as is with ABSOLUTELY NO WARRANTY or GUARANTEE!

Also, the demo will show how to use RLS with the HierarchyID datatype, as well as the new Dynamic Data Masking and Always Encrypted security features.

This post is about the demo which will show you an advanced implementation template for RLS as well as some of the other security features. Please use the links above to the official Microsoft documentation to learn about each feature first before trying out the demo as it will help you understand the demo better.

If you attended the session at SQL Summit 2015, Overview of Security Investments in SQL Server 2016 and Azure SQL Database [DBA-327-M], which I co-presented with the Microsoft SQL Server Security team, this is the demo we used at the end.


Asset and Organization Hierarchies and RLS


The basis of the security is nodes in the organizational hierarchy are granted access to levels in the asset hierarchy and those grants filter down to the members below in the hierarchy. This allows for inheritance of permissions via the Organization and Asset Hierarchy (i.e. Child nodes can inherit from Parent Nodes).

Functionality is built-in to the model to override the security model for a lower member including denying access altogether (‘NONE’ in the security map) or granting access to all data (‘ALL’ in the security map) via exceptions (SEC_USER_EXCEPTIONS). A base user table exists (SEC_ORG_USER_BASE) that has the relationship of employee to manager as well as the organizational unit id for the employee. This table would likely be fed from an HR system in a production scenario. A master table for the wells (WELL_MASTER) contains the asset hierarchy to filter security to the well data. Read the notes about the tables below for more details.
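To make the inheritance rule a little more concrete, here is a minimal Python sketch of the idea, not the demo's actual T-SQL. The dictionaries are hypothetical stand-ins for SEC_ORG_USER_BASE, SEC_ASSET_MAP and SEC_USER_EXCEPTIONS, and the user names, org unit ids and asset levels are made up for illustration. An exception wins outright; otherwise we walk up the manager chain until we find an org unit that has been granted something.

```python
# Hypothetical in-memory stand-ins for the demo tables. Column names and
# sample values are illustrative assumptions, not the demo's real schema.

# SEC_ORG_USER_BASE: employee -> manager and organizational unit
ORG_USERS = {
    "ceo":  {"manager": None,   "org_unit": "OU_CORP"},
    "vp":   {"manager": "ceo",  "org_unit": "OU_SOUTH"},
    "lead": {"manager": "vp",   "org_unit": "OU_SOUTH_ENG"},
    "eng1": {"manager": "lead", "org_unit": "OU_SOUTH_ENG"},
    "eng2": {"manager": "lead", "org_unit": "OU_SOUTH_ENG"},
}

# SEC_ASSET_MAP: org unit -> list of (asset level, asset value) grants
ASSET_MAP = {
    "OU_CORP":  [("ALL", "ALL")],
    "OU_SOUTH": [("REGION", "SOUTH")],
}

# SEC_USER_EXCEPTIONS: per-user overrides of the inherited grants
USER_EXCEPTIONS = {
    "eng2": [("NONE", "NONE")],
}


def effective_grants(user):
    """Return the grants that apply to a user.

    An exception overrides everything. Otherwise use the grants of the
    closest node at or above the user in the organizational hierarchy.
    """
    if user in USER_EXCEPTIONS:
        return USER_EXCEPTIONS[user]
    current = user
    while current is not None:
        grants = ASSET_MAP.get(ORG_USERS[current]["org_unit"])
        if grants:
            return grants
        current = ORG_USERS[current]["manager"]
    # Nobody in the chain was granted anything: no access.
    return [("NONE", "NONE")]
```

In this sketch eng1 has no grant of their own, so they inherit the vp's ("REGION", "SOUTH") grant, while eng2 is locked out by the exception entry.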

Below shows what an Asset and Organizational Hierarchy would look like (also, this is what is in the demo) and finally a walk down one branch of the organizational hierarchy to see how to apply RLS against the Asset Hierarchy.


Asset Hierarchy (snippet)



Organizational Hierarchy (snippet)


Asset and Organization Hierarchy and RLS (CEO to one USER)




Scripts and Explanations


There are 6 scripts to run to see the entire demo (just run them in the order they are numbered, 1 – 6):

1 – Oil&Gas RLS Demo – BuildTables.sql

The script will create the database, oilgasrlsdemo2016, and the tables needed for the demo

The Tables are as such:

  • ASSET_HIERARCHY – Table contains the asset hierarchy and is used to build the demo data for the asset hierarchy in the Well_Master table.
  • DATES – Generic table to hold a date dimension for use later in an Excel Power Pivot model.
  • SEC_ASSET_MAP – Table contains the entries mapping the organization units to the asset hierarchy levels for access to the data. The table would be managed by the security team for the application. Users and their subordinates are denied access to data via an entry of ou, ‘NONE’, ‘NONE’ or granted all access via an entry of ou, ‘ALL’, ‘ALL’.
  • SEC_ORG_USER_BASE – Table contains the employees including the employee to manager parent/child relationship to build an organization hierarchy and the organizational unit id for mapping to asset hierarchy levels for security. This table would likely be fed from an HR system. Also, will demonstrate Always Encrypted in this table.
  • SEC_ORG_USER_BASE_HID – Same as SEC_ORG_USER_BASE but includes HierarchyID column to demonstrate RLS with HierarchyID data types and to demonstrate Data Masking.
  • SEC_ORG_USER_BASE_MAP – Table contains the employee data including an entry (SECURITY_CLEARANCE) to denote the security clearance the employee is granted by walking down the organization hierarchy and finding the lowest level above including themselves that has been granted access to data. The SEC_ASSET_MAP table is used along with the SEC_ORG_USER_BASE table to generate the data in this table. The procedure REFRESH_SECURITY_TABLES is called to refresh the data in this table.
  • SEC_USER_EXCEPTIONS – Table contains entries to allow for overrides of the organization hierarchy based model. Any employee entered here will use permission defined in this table instead of what is inherited from the organizational hierarchy.
  • SEC_USER_MAP – This table is generated by the REFRESH_SECURITY_TABLES procedure and generates the asset level access for each user in the database based upon the values in the security tables SEC_ORG_USER_BASE_MAP, SEC_ASSET_MAP and SEC_USER_EXCEPTIONS. This is the ONLY table used by the functions for the Security Policy. The other SEC_ tables are used to generate this table for RLS.
  • WELL_DAILY_PROD – Contains COMPLETELY made-up and randomly generated daily well production data for Oil, Gas and NGL. Primary key is WELL_ID and RLS is achieved by using the asset hierarchy in the WELL_MASTER table to filter the WELL_IDs. This is a Clustered ColumnStore Indexed table.
  • WELL_MASTER – Contains COMPLETELY made-up and randomly generated master well data including the made up asset hierarchy. This is the main business table used for RLS for ALL well tables.
  • WELL_REASON_CODE – Contains COMPLETELY made-up and randomly generated daily well downtime data for Oil, Gas and NGL. Primary key is WELL_ID and RLS is achieved by using the asset hierarchy in the WELL_MASTER table to filter the WELL_IDs.


2 – Oil&Gas RLS Demo – LoadTables.sql

This script is used to load or generate the demo data including user and security tables and hierarchy and well data. There are two parameters close to the top that can be used to specify the amount of data to load – @wellcount and @drillyearstart. @wellcount specifies the number of wells to generate and @drillyearstart specifies the first possible year to use for a well. The start date for a well will be randomly selected between @drillyearstart and the current year the script is run.
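If you are curious what the two parameters amount to, here is a rough Python sketch of the same idea (the real script is T-SQL and will differ). The function name, the well_id numbering and the choice to cap start dates at today's date are my assumptions for illustration.

```python
import random
from datetime import date


def generate_wells(well_count, drill_year_start, seed=None, today=None):
    """Generate made-up wells with random start dates.

    well_count plays the role of @wellcount and drill_year_start the role
    of @drillyearstart. Capping the date at 'today' is an assumption.
    """
    rng = random.Random(seed)
    today = today or date.today()
    wells = []
    for well_id in range(1, well_count + 1):
        # Pick a year between the first allowed year and the current year.
        year = rng.randint(drill_year_start, today.year)
        first = date(year, 1, 1)
        last = min(date(year, 12, 31), today)
        # Pick a random day inside that year (never in the future).
        start = date.fromordinal(rng.randint(first.toordinal(), last.toordinal()))
        wells.append({"well_id": well_id, "start_date": start})
    return wells
```

Passing a seed makes the generated data repeatable, which is handy when you want to rerun a test against the same wells.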


3 – Oil&Gas RLS Demo – Security Setup.sql

This script sets up the RLS functions, policies and the procedure REFRESH_SECURITY_TABLES. The procedure REFRESH_SECURITY_TABLES is used to generate the RLS security mappings in the SEC_ tables as described in the sections above. In a production environment this procedure would need to be run every time the hierarchies were updated or new users were added to the database.

This script also will build users in the database based on the userids generated in SEC_ORG_USER_BASE table for testing RLS.
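In SQL Server 2016 a filter predicate is an inline table-valued function that a security policy binds to a table, and a row is visible only when the function returns a row for it. The Python below is only a sketch of the kind of check such a function might make against SEC_USER_MAP. The column layout (user, asset level, asset value), the REGION and FIELD level names and the sample rows are all assumptions of mine, not the demo's schema.

```python
# Hypothetical slice of WELL_MASTER: each well carries its asset hierarchy.
WELL_MASTER = [
    {"well_id": 1, "REGION": "SOUTH", "FIELD": "EAGLE"},
    {"well_id": 2, "REGION": "SOUTH", "FIELD": "HAWK"},
    {"well_id": 3, "REGION": "NORTH", "FIELD": "BEAR"},
]

# Hypothetical SEC_USER_MAP rows: (user, asset level, asset value).
SEC_USER_MAP = [
    ("ceo", "ALL", "ALL"),
    ("vp_south", "REGION", "SOUTH"),
    ("eng_hawk", "FIELD", "HAWK"),
    ("contractor", "NONE", "NONE"),
]


def can_see(user, well):
    """Mimic a filter predicate: True when any map row admits this well."""
    for map_user, level, value in SEC_USER_MAP:
        if map_user != user:
            continue
        if level == "ALL":
            return True
        if level != "NONE" and well.get(level) == value:
            return True
    return False


def visible_wells(user):
    """Return the well ids the user would get back from a SELECT."""
    return [w["well_id"] for w in WELL_MASTER if can_see(user, w)]
```

Because the check only ever reads the one mapping table, all of the hierarchy walking can be done ahead of time by the refresh procedure, which keeps the per-row test cheap.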


4 – Oil&Gas RLS Demo – Test RLS Security.sql

This script contains sample queries to test RLS at different levels in the organizational hierarchy and asset hierarchy. It also will add another entry in the security table granting a user access to another level in the hierarchy along with their current level and run the procedure to update the security to include this new entry.


5 – Oil&Gas RLS Demo – RLS Security with HierarchyID and Data Masking.sql


This script makes a copy of the SEC_ORG_USER_BASE table called SEC_ORG_USER_BASE_HID that contains a version of the Organizational Hierarchy using the HierarchyID Datatype. It shows how to populate a HierarchyID Datatype from a Parent/Child Hierarchy and will implement RLS security using the HierarchyID Datatype as well as add some new columns, EMAIL_ADDRESS and DOB (Date of Birth) to the table to highlight Dynamic Data Masking in SQL Server 2016.
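A HierarchyID value has a string form like /2/1/, where each number is a child's position under its parent. As a minimal sketch of converting a Parent/Child table into that form (in Python rather than the script's T-SQL, with made-up employee ids and an assumed single root), you can assign paths from the top down:

```python
from collections import defaultdict


def build_paths(parent_of):
    """Map each employee to a HierarchyID-style path string.

    parent_of maps employee -> manager, with None for the single root.
    Children are numbered in sorted order, which is an arbitrary choice.
    """
    children = defaultdict(list)
    root = None
    for employee, manager in parent_of.items():
        if manager is None:
            root = employee
        else:
            children[manager].append(employee)

    paths = {root: "/"}
    stack = [root]
    while stack:
        node = stack.pop()
        for position, child in enumerate(sorted(children[node]), start=1):
            paths[child] = f"{paths[node]}{position}/"
            stack.append(child)
    return paths


def is_descendant_of(path, ancestor_path):
    """Prefix test on the path strings; a node counts as its own descendant."""
    return path.startswith(ancestor_path)
```

With paths in hand, "everyone under this manager" becomes a simple prefix test, which is the same question the HierarchyID IsDescendantOf method answers in SQL Server.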


Info about HierarchyID –


6 – Oil&Gas RLS Demo – Always Encrypted.sql

This script will set up the Master and Column keys for Always Encrypted as well as add a new encrypted column, SSN, to the SEC_ORG_USER_BASE table to test Always Encrypted. NOTE: If the database you are using for this demo is on a different server from the one where you will run the application (see below) included in the download to test all the features, you will need to run this script from SSMS on the machine where you run the application, and that machine will need .NET 4.6.


ERD of O&G RLS Demo DB




Demo Application


There is an application built in Visual Studio 2015 (.NET 4.6) that you can use to test out the demo once you run all 6 scripts (Source code is included as well!). Just run the EXE in the download on a machine with .NET 4.6 installed and the instructions are at the bottom of the application (see screenshot below).

Use the WELL_MASTER, WELL_DAILY_PROD and/or WELL_DOWNTIME to test the Parent/Child RLS Demo (Scripts 1 – 4). Use the SEC_ORG_USER_BASE_HID table to test the RLS with HierarchyID and Dynamic Data Masking (Script 5). Finally, use the SEC_ORG_USER_BASE table to test the Always Encrypted. You can EDIT the fields by clicking inside of them so for the Always Encrypted Demo you would click in the cell for SSN for an employee and enter a valid SSN and click “Commit” (see screenshot below).




Demo Instructions, Download and Contents


Below are the contents of the demo download and a description of each file/folder. To run through the demo, do the following:

  1. Download and extract the files to a folder
  2. Run all the scripts in the MSSQL 2016 folder in order, 1-6, on SQL Server 2016 CTP 2.4+ or Azure SQL Database v12.
  3. Then use the OGSecurityDemo2016.exe file to run the Demo Application

MSSQL 2016 – Folder with the 6 demo scripts
OSSecurityDemo2016 – Folder with the source code (VS2015 .NET4.6) for the Demo Application
SimpleRLSExample – Folder with a simple RLS Demo (start slow)
O&G_Demo_ERD.png – Graphic of ERD
OGSecurityDemo2016.exe – Demo Application (you can just run it if you have .NET 4.6)
Oil&Gas SQL Server Security Demo.docx – Overview of the Demo


Link to download –!135238&authkey=!AF7fpcw5K5waySg&ithint=folder%2czip


Hope you enjoy and tweet me @STATCowboy if you have questions, comments or concerns or need help with the demo!

Setup Azure SQL Database using the New Azure Portal

April 7, 2015

Jamey Johnston

Azure SQL Database is a relational database-as-a-service that allows for scaling up to thousands of databases. It is a fully managed service that allows for near-zero maintenance; in other words, Microsoft handles all the backups, patching and redundancy. Azure SQL Database offers service tiers that allow for dialing the horsepower up or down as needed, which also means the pricing can scale up or down. Pricing for the service tiers ranges from ~$5/month for the slowest and smallest in size to ~$3,270/month for the fastest and largest.

More Info on Azure SQL Database

In this post we will walk through the steps to create an Azure SQL Database using the new Azure Portal. Please note the new Portal is still in beta so the steps and screenshots may change as Microsoft still rolls out the new portal.

This post is based on the documentation on Azure, Get started with SQL Database by jeffgoll, which shows how to setup an Azure SQL Database using the older Azure Portal.


Step 1: Create an Account on Azure


This step can be skipped if you already have an account on Azure. If not, go to and click on the “Free Trial” in the upper right corner to get started.


Step 2: Logon to the New Azure Portal and Provision a Virtual Server and SQL Database


  1. Access the new Azure portal at and login with your Azure Account.

  2. Click “New” at the bottom left of the page, then “Data + Storage”, then “SQL Database” to start the SQL Database wizard.

  3. In the “SQL Database” panel enter your desired Azure SQL Database name in the “Name” field (e.g. myfirstazuresqldb) then click “Server / Configure required settings“. In the Server Panel pop-out to the side, click “Create a new server” (I am assuming if you are reading this tutorial you don’t have an existing Azure SQL Server setup!).




  4. In the “New Server” panel pop-out enter an Azure SQL Server name in the “SERVER NAME” field (e.g. myfirstazuresqlserver), a Server Admin Login account name in the “SERVER ADMIN LOGIN” field (e.g. SQLAdmin), a password in the “PASSWORD” field and enter the password again in the “CONFIRM PASSWORD” field. Choose the location where you want the server to reside by clicking in the “LOCATION” area and choosing the Location. Leave “Yes” chosen in the “CREATE V12 SERVER (LATEST UPDATE)” option so we can learn some new features in a later blog post. Leave “ALLOW AZURE SERVICES TO ACCESS SERVER” checked. Click “OK” at the bottom of the “New Server” panel to continue.



  5. Back on the “SQL Database” panel leave “Blank Database” selected in “SELECT SOURCE” option and click on the “PRICING TIER” option and choose “B Basic” in the Pricing Tier panel pop-up and click “Select” at the bottom.




  6. On the “SQL Database” panel you can set the desired Collation but we will leave it at the default for now. Click on “Resource Group”, then in the “Resource Group” panel choose “Create a new resource group” and then in the “Create resource group” panel type a resource group name in the “Name” field (e.g. myfirstresourcegroup).



  7. Choose the “Subscription” (e.g. Visual Studio Ultimate with MSDN) to pay for the Azure service. Check “Add to Startboard” to have the SQL Database show up on the Portal front page. Then click “Create” to begin provisioning your new Azure SQL Database! 🙂



  8. The page will go back to the Portal front page and a tile will appear with the title “Creating SQL Database”. Also, in the side bar to the left you will see a notification.




  9. Once the database is provisioned the pages will refresh and show the Azure SQL Database Dashboard page.



  10. Your database is provisioned!



Step 3: Add Firewall Rule to Access Azure SQL Database


  1. Click on the Azure SQL Server under the “Summary” section.



  2. The “SQL Server” panel will pop-out to the right, click “Settings” to open up the SQL Server configuration panel to the right.



  3. In the “Settings” panel click on “Firewall”. In the “Firewall Settings” panel that pops-out enter a “RULE NAME” (e.g. Home), the START IP and END IP which will probably be the same IP address if testing from home. You can use a website like to get your external IP address. Click “Save” at the top once finished entering the values. (NOTE: The values below are not valid and you should enter the correct ones for your location).

    Your IP may change so check periodically to make sure you have the correct IP addresses in the firewall rules. Also, if you are trying to connect from work, your work firewall may block access to the default port of 1433, which is used for access to the Azure SQL Database, so if you are having issues trying to connect from work that may be your issue! 🙂



  4. A message should appear indicating the firewall rules were successfully updated. Click “Ok”.



  5. Firewall is configured.
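A firewall rule is just an inclusive range from START IP to END IP, so a quick way to sanity-check whether your current address would be let through is a small Python sketch like the one below. The function name is mine, and the addresses in the usage note are documentation-only examples, not real values.

```python
import ipaddress


def ip_in_rule(client_ip, start_ip, end_ip):
    """Return True when client_ip falls inside the inclusive rule range."""
    client = ipaddress.ip_address(client_ip)
    return ipaddress.ip_address(start_ip) <= client <= ipaddress.ip_address(end_ip)
```

For a home setup where START IP and END IP are the same address, ip_in_rule("203.0.113.25", "203.0.113.25", "203.0.113.25") is True, and it turns False as soon as your provider hands you a different address.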


Step 4: Setup SQL Server Management Studio to Manage and Access Azure SQL Database


  1. Finally to test your new Azure SQL Database download the SQL Server 2014 Management Studio Express and install using the defaults. Choose the 32-bit or 64-bit version depending on your O/S version (probably 64-bit for most).





  2. Launch SQL Server 2014 Management Studio from your PC and login to the server you created in Step 2 using the server name, admin account and password and click “Connect”.





  3. Congratulations you have created your first Azure SQL Database!
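If you would rather connect from code than from Management Studio, the same three pieces of information (server name, admin account and password) go into a connection string. Below is a minimal Python sketch that only builds an ODBC-style string. The helper name is hypothetical, and the driver name is an assumption that depends on what is installed on your machine. The full server name is the name you chose plus .database.windows.net, on port 1433.

```python
def azure_sql_connection_string(server_name, database, user, password,
                                driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC-style connection string for an Azure SQL Database.

    The driver name is an assumption; use whichever SQL Server ODBC
    driver is installed locally.
    """
    host = f"{server_name}.database.windows.net"
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER=tcp:{host},1433;"
        f"DATABASE={database};"
        f"UID={user};PWD={password};"
        "Encrypt=yes;TrustServerCertificate=no;"
    )
```

A string like this could then be handed to an ODBC library of your choice. Remember that the firewall rule from Step 3 still has to allow the machine you are connecting from.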