I am working on a Machine Learning Flask project on Electric Vehicle Price Prediction to deepen my practical skills. I have completed the data exploration and model creation, and the Flask app runs on localhost.

When I fill out the form with the information needed for a price prediction and click the submit button, I get this error:

ValueError
ValueError: Found unknown categories ['98368'] in column 2 during transform

Traceback (most recent call last)
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\flask\app.py", line 1498, in __call__
    return self.wsgi_app(environ, start_response)
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\flask\app.py", line 1476, in wsgi_app
    response = self.handle_exception(e)
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\flask\app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\flask\app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\flask\app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\flask\app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "G:\Machine_Learning_Projects\2024\electric_vehicle_price_prediction_2\app\routes.py", line 38, in predict
    price = predict_price(features)
  File "G:\Machine_Learning_Projects\2024\electric_vehicle_price_prediction_2\app\model.py", line 29, in predict_price
    transformed_features = encoder.transform(features_df)
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\sklearn\utils\_set_output.py", line 157, in wrapped
    data_to_wrap = f(self, X, *args, **kwargs)
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\sklearn\preprocessing\_encoders.py", line 1027, in transform
    X_int, X_mask = self._transform(
  File "C:\Users\2021.conda\envs\electric_vehicle_price_prediction_2\lib\site-packages\sklearn\preprocessing\_encoders.py", line 200, in _transform
    raise ValueError(msg)
ValueError: Found unknown categories ['98368'] in column 2 during transform

Here is my code in the routes.py file:

from flask import render_template, request, jsonify
# , Environment, PackageLoader, select_autoescape
from app import app
from app.model import predict_price
from jinja2 import Environment, FileSystemLoader, PackageLoader, select_autoescape

@app.route('/')
def index():
    return render_template('index.html')


# Alternative index view using an explicit Jinja2 environment. It is shown
# commented out because Flask cannot register two views named 'index'.
# @app.route('/')
# def index():
#     env = Environment(
#         loader=PackageLoader("app"),
#         autoescape=select_autoescape()
#     )
#     template = env.get_template("index.html")
#     return render_template(template)


@app.route('/predict', methods=['POST'])
def predict():
    data = request.form.to_dict()

    # Convert the form data into the correct format for prediction
    features = [
        data['county'],
        data['city'],
        data['zip_code'],
        data['model_year'],
        data['make'],
        data['model'],
        data['ev_type'],
        data['cafv_eligibility'],
        data['legislative_district']
    ]

    # Get the prediction result
    price = predict_price(features)

    return jsonify({'predicted_price': price})

Here is my code in the model.py file:

import pandas as pd
from sklearn.preprocessing import OneHotEncoder
from sklearn.ensemble import RandomForestRegressor
import joblib
from flask import Flask, render_template
# Environment
# , FileSystemLoader
# , PackageLoader, select_autoescape
from jinja2 import Environment, FileSystemLoader, PackageLoader, select_autoescape

env = Environment(
    loader=PackageLoader("app"),
    autoescape=select_autoescape()
)

# Load the pre-trained model (make sure to save your model from the notebook)
model = joblib.load('model/ev_price_model.pkl')


def predict_price(features):
    # Preprocess the features
    # For simplicity, assume you have transformed your dataset in the same way as in the notebook.

    # OneHotEncoder or preprocessing steps as in your notebook
    encoder = joblib.load('model/encoder.pkl')  # Load encoder if needed

    features_df = pd.DataFrame([features], columns=['County', 'City', 'ZIP Code', 'Model Year', 'Make', 'Model', 'Electric Vehicle Type', 'Clean Alternative Fuel Vehicle (CAFV) Eligibility', 'Legislative District'])

    # Apply encoding, scaling, etc., if necessary
    transformed_features = encoder.transform(features_df)

    # Make the prediction
    price = model.predict(transformed_features)

    return price[0]  # Assuming it returns a single value

Here is my GitHub repo: https://github.com/MdEhsanulHaqueKanan/electric-vehicle-price-prediction-2

How can I fix this issue?

desertnaut

1 Answer

Based on the error, it seems you are using the OrdinalEncoder to encode values in your pipeline. By default, this encoder raises an error when it encounters an unknown value (see the documentation), which is the error you're getting. If you want to keep using this encoder, setting the handle_unknown parameter to 'use_encoded_value' and providing a value for the unknown_value argument should fix the issue. If the saved encoder is instead the OneHotEncoder that model.py imports, the corresponding setting is handle_unknown='ignore', since 'use_encoded_value' is only accepted by OrdinalEncoder.
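As a rough illustration, here is a minimal sketch of how scikit-learn encoders treat a category that was not seen during fit. The toy data, column values, and the sentinel -1 are made up for the example and are not taken from your notebook. The encoder would presumably need to be refitted there with the chosen setting and saved again before the Flask app picks it up.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Toy training data; the values are invented for illustration only.
train = pd.DataFrame({"City": ["Seattle", "Tacoma"],
                      "ZIP Code": ["98101", "98402"]})
# A row whose ZIP code was never seen during fit, as in the traceback.
unseen = pd.DataFrame({"City": ["Seattle"], "ZIP Code": ["98368"]})

# Default behaviour (handle_unknown="error"): transform raises ValueError.
strict = OrdinalEncoder().fit(train)
try:
    strict.transform(unseen)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# OrdinalEncoder: map unseen categories to a sentinel code instead of raising.
ordinal = OrdinalEncoder(handle_unknown="use_encoded_value",
                         unknown_value=-1).fit(train)
ordinal_row = ordinal.transform(unseen)

# OneHotEncoder: encode an unseen category as all zeros for that column.
onehot = OneHotEncoder(handle_unknown="ignore").fit(train)
onehot_row = onehot.transform(unseen).toarray()

print(ordinal_row)
print(onehot_row)
```

Note that either setting only prevents the exception. The model still receives no real information about the unseen ZIP code, so predictions for such inputs may be less reliable.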

Oxbowerce