Machine Learning Recommendation
Machine Learning Recommendation is a key feature of Oracle Analytics Cloud. Oracle Analytics Cloud is a single platform that empowers your entire organization to ask any question of any data using any device in any environment.
WHAT
Oracle Analytics Cloud
WHERE
Oracle
WHEN
2017
Project duration: 5 months
Project Overview
At that time the product was Oracle Data Visualization, which later became Oracle Analytics Cloud. The platform could visualize columns (attributes, also called dimensions, and metrics) only if they had been set up properly from the beginning. In demos they were; in real life they were not. When data was ingested, a single simplified rule decided how to treat it: numbers were metrics, and everything else was considered an attribute (or dimension). In real life, phone numbers, zip codes and many other numeric columns cannot be metrics at all: there is no real use case for the average of phone numbers or the standard deviation of zip codes. Other numeric values, such as credit card numbers, should not even be visible (and should not be treated as numeric values either).
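To make the ingest rule from the overview concrete, here is a minimal Python sketch. It is not Oracle's implementation: the function names, the hint lists and the "hidden" role are assumptions made up for illustration. It contrasts the simplified rule (numbers are metrics, everything else is an attribute) with a name-based refinement of the kind the project was aiming for.

```python
# Hypothetical name hints; a real knowledge base would be far richer.
IDENTIFIER_HINTS = ("phone", "zip", "postal")
SENSITIVE_HINTS = ("credit_card", "card_number")


def naive_role(values):
    """The simplified ingest rule: numbers are metrics, everything else is an attribute."""
    is_numeric = all(isinstance(v, (int, float)) for v in values)
    return "metric" if is_numeric else "attribute"


def profiled_role(column_name, values):
    """Illustrative refinement: let the column name override the naive rule."""
    name = column_name.lower()
    if any(hint in name for hint in SENSITIVE_HINTS):
        return "hidden"      # e.g. credit card numbers should not be visible
    if any(hint in name for hint in IDENTIFIER_HINTS):
        return "attribute"   # numeric, but averaging it is meaningless
    return naive_role(values)


# Made-up sample columns, for illustration only.
columns = {
    "revenue": [120.5, 99.0],
    "zip_code": [94065, 10001],
    "phone": [5551234, 5559876],
    "credit_card_number": [4111111111111111],
    "city": ["Austin", "Boston"],
}

naive = {name: naive_role(vals) for name, vals in columns.items()}
profiled = {name: profiled_role(name, vals) for name, vals in columns.items()}
```

With the naive rule, `zip_code`, `phone` and `credit_card_number` all end up as metrics; the refined rule keeps `revenue` as the only metric.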
My Contribution
- I was responsible for designing a proposal that would give every data column the proper type
- I had to define where and how to solve the problem
- I delivered detailed plans, checked with developers for feasibility and viability, to achieve this goal (the actual development happened much later, in a waterfall process)
Challenges & Advantages
- The design department worked in silos, so making actionable plans required finding a way to reach developers and supportive PMs in order to articulate a real plan
- I could leverage the knowledge I had built up earlier, and there was a Machine Learning algorithm on the back end that could serve to build up a knowledge base (architecturally, those pieces were available on the back end, but no front end had been designed for them)
The Journey
Because of Machine Learning (ML), the topic got very high visibility and a strong push, even though no developers were allocated to work on it. Product Managers (PMs) got excited and had their own visions, but those were not necessarily aligned with user needs. From my experience, it was important to recognize that even though ML was a hot topic, users did not really need an algorithm; they needed help with their everyday tasks. The real blocker came at the moment when the user wanted to visualize something and could not.
For example, if something was considered a metric, the user could not visualize it as a dimension. Some metrics are obvious to data scientists, like the number of records, but that one was not generated by default, and business people had even less understanding of why it was needed. The actual software showed something like this:

The cursor indicated the issue, and instead of a visualization there was only a message saying that the data was insufficient. Even more challenging, just looking at the data did not bring the user any closer to the solution: a huge table of values suggests that everything is right.
There were multiple separate ideas to solve the issue in many areas:
1. right when the user adds the data
- we could do better automatic bucketing. There was already a product that used a knowledge base and converted every column to the right type. Using that product sounded tempting, but I learned from the developers that none of the pieces had the right API layer to be connected and do their job. In addition, users are not keen on unknown automatic behaviors, and the problem could still exist: a bucketed number can be a perfect dimension (for example, all the products sold for $100 could be one bucket, and the number of products sold could be a metric), so the user could face the same problem
- we could do assisted manual ingesting, which is the easiest way to align the data with its intended use; it required less compute but was more labor-intensive
2. when the user wants to visualize the data
- one may imagine automatically changing the type when dragging, but with big data that is not a real option, because the conversion can take a long time
- help the user do what was needed
The design challenge with the help approach was to avoid designing another jarring Clippy.
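The bucketing point from option 1 can be illustrated with a short Python sketch. The bucket width, the label format and the sample prices are assumptions chosen for illustration; nothing here reflects how the product actually bucketed values. It shows a numeric column (price) acting as a dimension once bucketed, while the count of products sold becomes the metric.

```python
from collections import Counter


def price_bucket(price, width=50):
    """Return a label such as '$100-$149' for the bucket containing `price`."""
    low = int(price // width) * width
    return f"${low}-${low + width - 1}"


# Made-up sale prices, one entry per product sold.
sales = [100, 100, 100, 120, 35, 49.99]

# The bucket label is the dimension; the number of products sold is the metric.
units_per_bucket = Counter(price_bucket(price) for price in sales)
```

Here the price column is numeric, yet the user would want to drag it in as a dimension, which is exactly the case a purely type-based rule gets wrong.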

The Solution
The proposed solution was a complex ML-based recommendation system that could be used consistently in many areas. Because 60% of the cases could be solved at the ingest level, that was the proposed first development step. The in-place recommendation could solve the remaining ones. However, the in-place recommendation could not live without a more development-intensive knowledge base, which was trained by ingesting. That is why I proposed implementing it afterwards, even though it would be the complete solution.
As a usability principle, I wanted to avoid black-box solutions (which made all users uneasy), so a transparent and actionable display of the "why" was an important rule. I checked the validity of the concept in detail with the developers and refined the proposal based on the technical possibilities.
The first major component was logic that ran as part of the data ingest. In the load dialog it appeared only as a checkbox with a button. The checkbox turned the profiling on or off (on by default), and the button next to it let the user go to the configuration area to see all the options and learn about the profiling itself.
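As a rough illustration of the "no black box" principle, a recommendation can carry its own explanation and change nothing until the user accepts it. The sketch below is hypothetical: the `Recommendation` class, its fields and the `apply_recommendation` helper are invented for this example and are not taken from the actual design documents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    column: str
    current_role: str
    suggested_role: str
    reason: str  # the "why", always shown to the user

    def message(self):
        """Text shown to the user: the suggested action plus the reason for it."""
        return (
            f"Treat '{self.column}' as {self.suggested_role} "
            f"instead of {self.current_role}: {self.reason}"
        )


def apply_recommendation(roles, rec, accepted):
    """Return updated column roles; nothing changes unless the user accepted."""
    updated = dict(roles)
    if accepted:
        updated[rec.column] = rec.suggested_role
    return updated


roles = {"zip_code": "metric", "revenue": "metric"}
rec = Recommendation(
    column="zip_code",
    current_role="metric",
    suggested_role="attribute",
    reason="zip codes identify places; averaging them has no meaning",
)

declined = apply_recommendation(roles, rec, accepted=False)
accepted = apply_recommendation(roles, rec, accepted=True)
```

Keeping the reason inside the recommendation itself means every place that shows a recommendation can also show the explanation, which fits the goal of reusing the same components across the product.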

After the data is loaded, the user gets to the Explore area to see more details about it. The Recommendation engine gives a small, modest indication when more can be done with a specific element type.

In-place interaction was as important as having all the recommendations collected in one place. I paid extra attention to having reusable components in the design, so the same components appear in multiple places. That helps the user learn and makes the proposal cheaper to develop. Sustainability is key to implementing it fast, which is why I introduced the same logic to the pipeline builder as well.




Finale
I produced extensive documentation of the plans, supported by proof-of-concept (POC) development. Oracle Analytics Cloud is still a flagship product for Oracle.
