Forecasting PM2.5 with Random Input Features and Sequential Predictions #10695

Open
rohan472000 opened this issue Aug 12, 2024 · 0 comments

rohan472000 commented Aug 12, 2024

Here's a sample of my dataset:

PM2_5  RH(%)  AT(°C)  Hour  Day  Month  Year  Station_ID
45.3   60     25      23    1    8      2024  1
47.1   62     26      0     2    8      2024  1
49.0   61     25      1     2    8      2024  1
46.8   59     24      2     2    8      2024  1
44.5   58     23      3     2    8      2024  1

In my current project, I'm using a model (like XGBoost or Random Forest) to forecast PM2.5 levels for the next 15 hours. Here’s the process I'm following:

  1. Starting Point: I use the last known data point from my dataset, which includes features like RH (Relative Humidity), AT (Air Temperature), and the current PM2.5 value.

  2. Sequential Forecasting: For each subsequent hour, I predict the next PM2.5 value using the model. For features like RH and AT, I either assume they remain constant at their last known values or update them with similarly basic logic.

  3. Observation: The model successfully provides forecasts for the next 15 hours, but I’m puzzled about how it is able to produce these forecasts when the future values of RH, AT, and other input features are not explicitly known or modeled for these time steps.
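To make the three steps above concrete, here is a minimal sketch of the loop as I understand it. Several things in it are assumptions for illustration: `StandInModel` is a hypothetical placeholder (a toy rule, not a trained XGBoost or Random Forest model), the feature order is made up, and I assume each predicted PM2.5 value is fed back in as the "current PM2.5" input for the next hour. Any fitted regressor with a scikit-learn-style `predict()` could presumably take its place.

```python
from datetime import datetime, timedelta


class StandInModel:
    """Hypothetical placeholder for a fitted regressor with a scikit-learn-style predict()."""

    def predict(self, rows):
        # Toy rule, NOT a trained model: pull PM2.5 slightly toward 40.
        return [0.9 * pm25 + 0.1 * 40.0 for pm25, *_ in rows]


def recursive_forecast(model, last_row, start_time, horizon=15):
    """Roll a one-step model forward `horizon` hours.

    last_row holds the last observed PM2_5, RH and AT values.
    RH and AT are held at their last known values (the assumption in step 2);
    only the calendar features and the fed-back PM2.5 change between steps.
    """
    pm25 = last_row["PM2_5"]
    rh, at = last_row["RH"], last_row["AT"]
    forecasts = []
    for step in range(1, horizon + 1):
        t = start_time + timedelta(hours=step)
        # Assumed feature order: PM2_5, RH, AT, Hour, Day, Month, Year
        features = [pm25, rh, at, t.hour, t.day, t.month, t.year]
        # Feed the prediction back in as the next step's PM2.5 input.
        pm25 = float(model.predict([features])[0])
        forecasts.append((t, pm25))
    return forecasts


# Last row of the sample above: 44.5, 58, 23 at hour 3 on 2024-08-02.
forecasts = recursive_forecast(
    StandInModel(),
    {"PM2_5": 44.5, "RH": 58, "AT": 23},
    datetime(2024, 8, 2, 3),
)
for timestamp, value in forecasts:
    print(timestamp, round(value, 2))
```

Written this way, the loop never lacks an input: every call to `predict()` receives a complete feature row, it is just that RH and AT in that row are stale copies rather than real future values.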

My Questions:

  • How does the model (like XGBoost or Random Forest) handle the forecasting when the future values of input features (such as RH and AT) are not available? Is it purely based on the assumption that these features remain constant?
  • Is my approach of using the last known values for RH, AT, etc., reasonable for making such forecasts? If not, what are the recommended practices for handling these unknown future input features in a time series forecasting context?
  • In such scenarios, does the model primarily rely on the patterns it learned during training to make predictions, even if the input features are not dynamically changing for each forecasted time step?
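One way I could probe the second question empirically is to check how far a one-step prediction moves when a held-constant feature is swept over plausible values. The sketch below is only an illustration under assumptions: `toy_predict` is a hypothetical stand-in for calling the real model on a single row, and its coefficients and the candidate RH values are made up.

```python
def forecast_spread(predict_one, base_features, feature_index, candidate_values):
    """Return (min, max) of the one-step prediction as one feature is varied."""
    preds = []
    for value in candidate_values:
        row = list(base_features)      # copy so the base row is not mutated
        row[feature_index] = value     # overwrite only the feature under test
        preds.append(predict_one(row))
    return min(preds), max(preds)


def toy_predict(row):
    # Hypothetical stand-in for model.predict on a single row; NOT a trained model.
    pm25, rh, at = row[0], row[1], row[2]
    return 0.8 * pm25 + 0.1 * rh + 0.05 * at


base = [44.5, 58, 23]                  # PM2_5, RH, AT from the last sample row
low, high = forecast_spread(toy_predict, base, 1, [50, 58, 66])
print(f"prediction range when RH varies: {low:.2f} to {high:.2f}")
```

If the range is narrow for the real model, holding RH and AT constant probably costs little; if it is wide, the constant-value assumption is likely to matter for the forecast.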