Can I eat this?

Predicting an undocumented mushroom's edibility with AI




Have you ever been lost in the woods, starving, and run into a mushroom that may or may not be edible? Fortunately for you, you have your cell phone, a stable connection to vemity.com, and a convenient mushroom edibility dataset! In this situation, your only hope for survival is to train an AI to predict whether or not the mushroom is edible! Let's get learning before you starve!



Setup

Before starting, make sure you have a Vemity account. Vemity is the AI as a Service tool we will be using to train the AI and run inference with it. If you don't have an account, create one here. You'll also need to download the dataset we'll be using.

Training

1. Navigate to the my models page on Vemity.

2. Click the New Model button at the top right of the page.

3. Our dataset is plain tabular data, meaning that every cell in every row is a short code that stands for some property of the mushroom. It is not an image, it is not a full sentence of text, it is simply a letter or number corresponding to something else. Click Data Classifier.

4. Name your AI "Mushroom Prediction" or whatever else you want to call it. I don't really care. Beneath that, write a one-sentence description of what the AI will do. This helps identify the AI within your organization and helps our system learn and get better. For example: "To predict whether a mushroom is edible or poisonous."

5. Click Upload your training data.

6. Click the upload button and find where you saved the dataset.

7. Here we need to tell the system a little bit about our data. It may help to open up the dataset and follow along with what we are entering.
  1. If you scroll through our dataset, you will find an occasional ?. This is a symbol representing that, for whatever reason, there was no result for that cell. This is okay and pretty typical, so we can handle it here.

    Enter ? here.

  2. Does our dataset include strings? In other words, is there anything in the dataset other than numbers? In this dataset, yes.

    Mark this as true.

  3. Should we drop any columns? When you train an AI, it's very sensitive to data. If there is any data in the dataset that does not directly apply to the desired prediction, it should not be included. ID columns should always be dropped, as they do not contribute to the prediction.

    Drop the id column.

  4. Finally, which column do we want to predict? The class column is what corresponds to the mushroom's edibility. e means edible, p means poisonous.

    Select the class column.


8. Here we can choose how to train the AI. In this tutorial, we are going to choose Optimal Training.

9. We can see on this page that each epoch will charge 407 credits. An epoch is one full pass of the AI through the entire dataset. This dataset should take no more than 15 epochs to train to a good accuracy. This will charge us 1219 credits at most to train. Because we are using Optimal Training, if the AI finishes training before we reach 15 epochs, it will automatically cut it off and save us credits.

10. When you're ready, click train. Your results will pop up when it's done training. It's pretty easy to get 100% accuracy on this dataset. That doesn't mean it will always be 100% right, but when it was tested against 10% of the dataset, it got every prediction correct.
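If you're curious what the settings in step 7 and the 10% test in step 10 amount to, here is a minimal local sketch in Python. It is not Vemity's implementation, just an illustration of the idea. The sample rows, the three feature columns, and the helper names prepare and holdout_split are all made up for this example.

```python
import csv
import io
import random

# Hypothetical sample in the shape of the mushroom dataset.
# Only three feature columns are shown; the real file has more.
SAMPLE_CSV = """id,class,cap-shape,odor,stalk-root
1,p,x,p,e
2,e,x,a,c
3,e,b,l,?
4,p,x,p,e
5,e,x,n,?
6,e,b,a,c
7,p,x,f,b
8,e,f,n,b
9,e,x,n,?
10,p,f,f,b
"""


def prepare(csv_text, missing="?", drop=("id",), target="class"):
    """Apply the step 7 settings: missing marker, dropped columns, target column."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        labels.append(row[target])
        features.append({
            name: (None if value == missing else value)
            for name, value in row.items()
            if name != target and name not in drop
        })
    return features, labels


def holdout_split(features, labels, test_fraction=0.1, seed=0):
    """Shuffle the rows and set aside a fraction of them for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(indices) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([features[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([features[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


features, labels = prepare(SAMPLE_CSV)
(train_x, train_y), (test_x, test_y) = holdout_split(features, labels)
print(len(train_x), "training rows,", len(test_x), "test rows")
print(features[2])  # the "?" cell shows up as None
```

The model only ever sees the feature columns, and the accuracy reported at the end is measured on rows it did not train on. That is why 100% on the held-out 10% is encouraging but not a guarantee for every mushroom in the woods.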


Inferencing

You're basically done! At this point, the AI has been trained. All you need to do from here is use it! There are two ways to do that:

Dashboard

To test the AI before you put it in your software, navigate to the models page, find your new AI, then click view. On this page, you can view some insights on the AI. Most importantly, at the bottom of the page, you can test the AI by sending in JSON formatted data.

Try testing ["b","s","n","t","p","f","c","n","k","e","e","s","s","w","w","p","w","o","p","k","s","u"].

If all goes well, you should receive something like this:
  {
    "converted_score": [
      {
        "confidence": 0.9997950196266174,
        "selection": 0
      }
    ],
    "error": "none",
    "raw_score": [
      [
        0.9997950196266174,
        0.00020505618886090815
      ]
    ]
  }

First, the converted_score array holds one entry per input row. Its selection field is the index of the result chosen, which corresponds to the array just above the response box, and its confidence field is the score the AI gave that result.

The error field will offer help if your model is underperforming. In the response above it is simply "none".

Finally, the raw_score array tells you the likelihood (as predicted by the AI, on a scale from 0 to 1) of each result. Its order corresponds with the results array above as well.
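To make the response fields concrete, here is a small Python sketch that reads the sample response and pairs each score with a label. The LABELS list is a placeholder and an assumption on my part: replace it with the results array shown on your model's page, in the order the page lists it. The helper name summarize is also just for illustration.

```python
import json

# The sample response from this tutorial, as a JSON string.
RESPONSE = """
{
  "converted_score": [{"confidence": 0.9997950196266174, "selection": 0}],
  "error": "none",
  "raw_score": [[0.9997950196266174, 0.00020505618886090815]]
}
"""

# ASSUMPTION: placeholder labels. Copy the real ones, in order, from the
# results array shown above the response box on your model's page.
LABELS = ["label_0", "label_1"]


def summarize(response_text, labels):
    """Return (results, note): one (label, confidence, scores) tuple per row."""
    data = json.loads(response_text)
    results = []
    for chosen, scores in zip(data["converted_score"], data["raw_score"]):
        label = labels[chosen["selection"]]
        results.append((label, chosen["confidence"], dict(zip(labels, scores))))
    return results, data["error"]


results, note = summarize(RESPONSE, LABELS)
for label, confidence, by_label in results:
    print(f"chose {label} with confidence {confidence:.4f}")
    print("all scores:", by_label)
if note != "none":
    print("note from the service:", note)
```

Notice that the confidence in converted_score is the same number as the winning entry in raw_score, so you only need raw_score if you care about the runner-up.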

API

When you're using your AI in your software, you'll need to use our API. This is simple! You are going to send an HTTP POST request to https://app.vemity.com/api/inference. The content should be contained in a JSON body in the following format.
  {
    "id": "YOUR_MODELS_ID",
    "input": [["input","data"],["as","a","2D","array"]],
    "email": "YOUR_EMAIL",
    "key": "YOUR_API_KEY"
  }

Your model's ID and your API key can both be found on the model view page.

It's also important to note that the input data should be placed in a two-dimensional array. This is an array with other arrays in it. The internal arrays should have the data you want the AI to process. Multiple internal arrays will produce multiple results. This allows you to inference a number of times in one API request.
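Here is a short Python sketch of that two-dimensional shape. The helper build_payload is hypothetical (it is not part of any Vemity library), and the check that every inner array has the same length is my own assumption about what a sensible request looks like, not a documented rule.

```python
import json


def build_payload(model_id, email, api_key, rows):
    """Build the request body; rows is a list of feature lists, one per mushroom."""
    if not rows or not all(isinstance(row, list) for row in rows):
        raise ValueError("input must be a non-empty list of lists (a 2D array)")
    # ASSUMPTION: every mushroom is described by the same number of values.
    if len({len(row) for row in rows}) != 1:
        raise ValueError("every inner array should have the same number of values")
    return {"id": model_id, "input": rows, "email": email, "key": api_key}


mushroom = ["b", "s", "n", "t", "p", "f", "c", "n", "k", "e", "e",
            "s", "s", "w", "w", "p", "w", "o", "p", "k", "s", "u"]

# One mushroom still needs the outer array around it.
single = build_payload("YOUR_MODELS_ID", "YOUR_EMAIL", "YOUR_API_KEY", [mushroom])

# Two mushrooms in one request (the same one twice, just to show the shape).
batch = build_payload("YOUR_MODELS_ID", "YOUR_EMAIL", "YOUR_API_KEY",
                      [mushroom, mushroom])

print(json.dumps(single))
print(len(batch["input"]), "rows in the batch")
```

The most common slip is passing a flat list for a single mushroom. The check at the top of build_payload catches that before the request ever leaves your machine.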

To test it out, try it in a curl request. Paste the following into your terminal and change the necessary values.

  curl -H "Content-Type: application/json" -X POST -d '{"id": "YOUR_MODELS_ID","input": [["b","s","n","t","p","f","c","n","k","e","e","s","s","w","w","p","w","o","p","k","s","u"]],"email": "YOUR_EMAIL","key": "YOUR_API_KEY"}' https://app.vemity.com/api/inference
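If you'd rather make the same call from code, here is a minimal sketch using only Python's standard library. It mirrors the curl command above: same URL, same header, same JSON body. The function names make_request and infer are my own, and the 30 second timeout is an arbitrary choice.

```python
import json
import urllib.request

API_URL = "https://app.vemity.com/api/inference"


def make_request(payload):
    """Wrap the payload in a POST request with a JSON body, like the curl command."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def infer(payload, timeout=30):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(make_request(payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = {
    "id": "YOUR_MODELS_ID",
    "input": [["b", "s", "n", "t", "p", "f", "c", "n", "k", "e", "e",
               "s", "s", "w", "w", "p", "w", "o", "p", "k", "s", "u"]],
    "email": "YOUR_EMAIL",
    "key": "YOUR_API_KEY",
}

# Building the request does not touch the network, so this is safe to run as is.
request = make_request(payload)
print(request.get_method(), request.full_url)
```

Once you have filled in your real model ID, email, and API key, calling infer(payload) sends the request and returns the response as a Python dictionary.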


You're done!

The next step is all up to you. Think of great ways to apply this knowledge and the Vemity platform.

If you're stumped or want help with your idea, send a support request on our support page.



© 2017 Vemity