This content originally appeared on DEV Community 👩‍💻👨‍💻 and was authored by Rutam Prita Mishra
Introduction
MindsDB brings machine learning capabilities to traditional databases. It acts as an AI layer on top of existing tables and lets users train models and predict outcomes with simple SQL statements.
MindsDB integrates with a wide range of databases and ML frameworks, so users can keep their existing data infrastructure and manage it alongside MindsDB. MindsDB is offered in two ways, as a local deployment (using Docker or pip) and as MindsDB Cloud, both of which offer a free tier.
In this tutorial we will be predicting the quality of wine based on its feature set using MindsDB Cloud.
Importing Data to MindsDB Cloud
We can obtain free datasets from sources such as Kaggle, Datahub, or Google Dataset Search to train our Predictor model. We will use this dataset from Kaggle for this tutorial.
So, let's get started with the process of adding this dataset to MindsDB Cloud.
Step 1: First, sign in to the MindsDB Cloud console or register for a new account.
Step 2: After logging in, the MindsDB Cloud Editor opens. It has a Query Editor at the top, where we write queries and click the Run (Shift+Enter) button above it to execute them; a Result Viewer at the bottom, where the query results are displayed; and the Learning Hub on the right, which works as a learning aid for new users.
Step 3: Find the Add Data button in the top right corner and click it. Then switch the tab at the top from Databases to Files and click the Import File button.
Step 4: Click the Import File section, then browse to the dataset file downloaded from the link above and select it. Provide a name for the table in the Table name field and click the Save and continue button to start the import.
Step 5: We will get redirected back to the MindsDB Cloud Editor upon successful import of the dataset. The Query Editor will now have two basic SQL queries listed in it.
Let's execute them one by one and check the results.
The first query should show the list of available Tables. Make sure there's a table in the list with the name that we supplied above while importing the dataset.
SHOW TABLES FROM files;
The second query lets us check that the table we just imported contains the right data records.
SELECT * FROM files.Wines LIMIT 10;
We are now ready with the data table and can now proceed to the next section.
Training a Predictor Model
With MindsDB, training a Predictor Model can be as easy as writing a SQL query and then executing it. So, let's see how we can do that by following the steps below.
Step 1: MindsDB offers a CREATE PREDICTOR statement that we can use to create and train the model. The syntax is as follows.
CREATE PREDICTOR mindsdb.predictor_name         -- your predictor name
FROM database_name                              -- your database name
(SELECT columns FROM table_name LIMIT 10000)    -- your table name
PREDICT target_parameter;                       -- your target parameter
An actual query with real names instead of the placeholders looks like the one below.
CREATE PREDICTOR mindsdb.wine_predictor
FROM files
(SELECT * FROM Wines LIMIT 10000)
PREDICT quality;
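The SELECT clause in the template accepts a column list as well as *. As a minimal sketch, assuming the imported table uses the column names that appear in the prediction queries later in this tutorial, a predictor could also be trained on a subset of the features. The name wine_predictor_small is hypothetical and used only for illustration.

```sql
-- Sketch: train on a subset of columns instead of SELECT *.
-- wine_predictor_small is a hypothetical name; the column names are
-- assumed to match the headers of the imported Wines file.
-- The target column (quality) must be part of the selected columns.
CREATE PREDICTOR mindsdb.wine_predictor_small
FROM files
(SELECT alcohol, pH, density, sulphates, quality FROM Wines LIMIT 10000)
PREDICT quality;
```

The rest of this tutorial continues with wine_predictor, which is trained on all columns.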
Step 2: The model might take some time to complete its training based on the size of the training data provided.
While we wait, we can check the status of the model with the command below. If the query returns complete, the model is ready to make predictions. If it returns generating or training, wait until the status is complete.
SELECT status
FROM mindsdb.predictors
WHERE name='Name_of_the_Predictor_Model';
The real query will be something like this.
SELECT status
FROM mindsdb.predictors
WHERE name='wine_predictor';
As the status is now complete, we can go on to predict wine quality.
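The same table can return more than the status. The query below is a sketch that assumes the mindsdb.predictors table in your MindsDB version also exposes accuracy and error columns; if it does not, SELECT * on the same table shows which columns are available.

```sql
-- Sketch: inspect the training outcome alongside the status.
-- The accuracy and error columns are assumed to exist in this
-- version of mindsdb.predictors.
SELECT name, status, accuracy, error
FROM mindsdb.predictors
WHERE name='wine_predictor';
```

If training fails, the error column is where a failure message would be expected to appear.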
Describing the Predictor Model
It is important to understand the details of our Predictor model before jumping in to make predictions.
So, in this section we will look at the details of our model in three different ways using the DESCRIBE statement.
- By Features
- By Model
- By Model Ensemble
By Features
This query returns the role of each column in the table for the model, along with the specific encoder used on each column to train the model.
DESCRIBE mindsdb.predictor_name.features;
By Model
This query lists all the candidate models that were used during training. The candidate model with 1 in its selected column is the one used by the Predictor model and is supposed to have the best performance value.
DESCRIBE mindsdb.predictor_name.model;
By Model Ensemble
This query returns JSON output listing the parameters that helped determine the best candidate model for the Predictor.
DESCRIBE mindsdb.predictor_name.ensemble;
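For the model trained in this tutorial, the three statements above become the following once the placeholder is replaced with wine_predictor. This only substitutes the name; run the statements one at a time in the Query Editor.

```sql
-- Role and encoder of each column
DESCRIBE mindsdb.wine_predictor.features;

-- Candidate models and which one was selected
DESCRIBE mindsdb.wine_predictor.model;

-- JSON with the parameters used to pick the best candidate
DESCRIBE mindsdb.wine_predictor.ensemble;
```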
It's now time to move on to the interesting part: predicting the wine quality values.
Querying the Model
MindsDB lets us predict target values with nothing more than a SELECT statement. Here, we can predict the quality of wine with a simple SELECT query that asks the model to return the predicted quality.
It should be noted that the quality of wine is determined by multiple feature values altogether. The accuracy may degrade if some of these feature values are left out.
But let us still try to predict the wine quality from only a few features with a query like this.
SELECT target_value_name, target_value_confidence, target_value_explain
FROM mindsdb.predictor_name
WHERE feature1=value1 AND feature2=value2 AND ...;
Our real query passes actual feature values like this.
SELECT quality,quality_confidence,quality_explain
FROM mindsdb.wine_predictor
WHERE pH=3.3 AND density=0.997 AND alcohol=9.4 AND sulphates=0.46;
Here the predicted quality (Wine Quality) is 5, so the wine can be called neither good nor bad.
Now let's pass values for all the features to get a more accurate prediction. That makes the query look like the one below.
SELECT quality,quality_confidence,quality_explain
FROM mindsdb.wine_predictor
WHERE pH=3.5 AND density=0.9976 AND alcohol=13 AND sulphates=0.86
AND fixed_acidity=10.3 AND volatile_acidity=0.3 AND citric_acid=0.72
AND residual_sugar=2.5 AND chlorides=0.075 AND free_sulfur_dioxide=11
AND total_sulfur_dioxide=20;
With a predicted value of 8, this can definitely be called a wine of good quality.
That's it! We have now successfully predicted the wine quality using the Predictor Model.
Note: The quality_confidence and quality_explain columns return the confidence value and other details such as anomaly, truth value and probability classes, respectively.
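The queries above predict one wine at a time. MindsDB also supports joining a data table with a predictor to get predictions for many rows at once. The following is a minimal sketch, assuming the files.Wines table and the wine_predictor model created earlier; the aliases t and m and the output column names actual_quality and predicted_quality are arbitrary choices for illustration.

```sql
-- Sketch: batch predictions by joining the data table with the model.
-- t is the imported table, m is the trained predictor.
SELECT t.alcohol, t.pH, t.quality AS actual_quality,
       m.quality AS predicted_quality
FROM files.Wines AS t
JOIN mindsdb.wine_predictor AS m
LIMIT 10;
```

Because these rows were also used for training, comparing the two quality columns gives only a rough sanity check, not an unbiased measure of accuracy.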
Conclusion
This marks the end of the tutorial. It's time for a quick recap now. We started by creating a MindsDB Cloud account, imported the dataset and created a table using the cloud GUI, created and trained a Predictor model, described its model details in three possible ways and finally predicted the quality of wine.
Now that the tutorial is over, I recommend that you create your own free MindsDB account and give it a spin. You can also install and use it locally. All the related instructions can be found here.
Lastly, before you leave, don't forget to leave your feedback in the Comments section below and show some love by dropping a LIKE on this article.
Rutam Prita Mishra | Sciencx (2022-10-15T18:41:49+00:00) Predict Wine Quality using MindsDB. Retrieved from https://www.scien.cx/2022/10/15/predict-wine-quality-using-mindsdb/