Bayesian MCMC model and Bayesian regression analysis model #1819

Open
Peridein opened this issue Jul 2, 2024 · 2 comments

Comments

@Peridein

Peridein commented Jul 2, 2024

I am wondering about the difference between a Bayesian MCMC model and a Bayesian regression analysis model.
I am using the tensorflow_probability library. The data (a target value and various related values) is imported from Excel, and I am building a model that predicts the target value from the related values alone, then computes residuals by comparing the actual data with the predicted values in order to examine errors.

  • The libraries used in this process are:

import tensorflow as tf
import tensorflow_probability as tfp
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

tfd = tfp.distributions
tfb = tfp.bijectors

  • Defining the reliability (prior probability) of the true value:

sensor_accuracy = 0.90
# actual_value is the measured target value loaded from Excel
prior_loc = tf.cast(actual_value, dtype=tf.float32)
prior_scale = tf.cast((1 - sensor_accuracy) * actual_value, dtype=tf.float32)

  • After defining the model, setting up and running the MCMC sampler:

# `model` is assumed to be defined earlier as the generator passed to JointDistributionCoroutine
joint = tfd.JointDistributionCoroutine(model)
num_results = 10000
num_burnin_steps = 500

# `n` is assumed to be the observed data that the joint distribution is conditioned on
def target_log_prob_fn(predicted_value):
    return joint.log_prob(predicted_value, n)

initial_state = [tf.constant(actual_value, dtype=tf.float32)]

hmc = tfp.mcmc.HamiltonianMonteCarlo(
    target_log_prob_fn=target_log_prob_fn,
    num_leapfrog_steps=20,
    step_size=0.01)

@tf.function
def run_chain():
    return tfp.mcmc.sample_chain(
        num_results=num_results,
        num_burnin_steps=num_burnin_steps,
        current_state=initial_state,
        kernel=hmc,
        trace_fn=lambda current_state, kernel_results: kernel_results)

samples, kernel_results = run_chain()

This is my first time coding, so I got advice from people around me and also used ChatGPT to construct the code. As a result, I am asking this question because, due to my limited overall understanding of the code, I do not clearly understand whether the concept behind this model is Bayesian MCMC or Bayesian regression analysis.
Thank you for reading. Any advice would be greatly appreciated.

@chrism0dwk
Contributor

Hello @Peridein,

I think there is some more fundamental understanding of statistical modelling you need, which almost certainly goes beyond the scope of TFP (which is, after all, a library for implementing Bayesian statistical models, rather than understanding Bayesian statistics). You may be better off asking the question on a statistics forum (e.g. StackOverflow with appropriate tags). However, let's see if I can help you a little here.

Regression

Regression is the practice of determining coefficients (parameters) for a model in order to predict an outcome from a set of covariates (i.e. features). A common example of regression is linear regression, where for the $i$th set of covariates $x_i$ and outcome $y_i$, $i=1,\dots,n$, we assume that

$$y_i = \alpha + x_i^T\vec{\beta} + \epsilon_i$$

where $\alpha$ is the intercept (i.e. mean value of $y_i$ if all covariates are equal to 0), $\vec{\beta}$ is a vector of unknown coefficients (parameters), and $\epsilon_i$ is an error term. What makes this special is that we assume that all the $\epsilon_i \sim \mbox{Normal}(0, \sigma^2)$ error terms are drawn from a Normal distribution with mean 0 and (unknown) standard deviation $\sigma$.

So, regression is a particular type of modelling concept.
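For concreteness, here is a minimal, non-Bayesian sketch of fitting that linear regression by ordinary least squares; the data below is entirely made up for illustration:

import numpy as np

# Synthetic covariates X (n rows, p columns) and outcomes y, purely for illustration
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = 0.3 + X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=n)

# Ordinary least squares: estimate the intercept alpha and coefficients beta together
design = np.column_stack([np.ones(n), X])   # prepend a column of ones for the intercept
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
alpha_hat, beta_hat = coef[0], coef[1:]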

Bayesian statistics

To say a model is Bayesian is to say something about how we want to estimate the unknown coefficients $\vec{\beta}$. In Bayesian statistics, we place prior probability distributions on all unknown quantities, effectively expressing what we believe to be their values in advance of observing the outcome data $\vec{y}$. If we didn't know much about the coefficients in the regression model above, but we believed that their values might be closer to 0 than to either $\infty$ or $-\infty$, we might write the Bayesian version of this regression (i.e. Bayesian regression) model as

$$\begin{align} \alpha & \sim \mbox{Normal}(0, 1000) \\ \vec{\beta} & \sim \mbox{MultivariateNormal}(\vec{0}, 1000 I) \\ \sigma & \sim \mbox{Gamma}(0.1, 0.1) \\ y_i & \sim \mbox{Normal}(\alpha + x_i^T\vec{\beta}, \sigma^2) \end{align}$$

We are now interested in estimating the a posteriori distributions of $\alpha$, $\vec{\beta}$ and $\sigma$, which is to say their probability distributions conditional on having observed the outcome variables $\vec{y}$ (which, if you think about it, restricts the values that the coefficients can take over and above their individual prior distributions). Bayes' Theorem tells us how this is possible, and the Reverend Thomas Bayes therefore lends his name (very posthumously!) to the Bayesian statistical way of thinking.
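As a rough sketch, the Bayesian regression model above might be written with TFP along these lines; the function name and the design matrix X are illustrative assumptions, not fixed API:

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

def bayesian_regression(X):
    # X: an (n, p) float32 design matrix of covariates; names here are illustrative
    @tfd.JointDistributionCoroutineAutoBatched
    def model():
        alpha = yield tfd.Normal(loc=0., scale=1000., name='alpha')         # prior on the intercept
        beta = yield tfd.Sample(tfd.Normal(loc=0., scale=1000.),
                                sample_shape=X.shape[-1], name='beta')      # priors on the coefficients
        sigma = yield tfd.Gamma(concentration=0.1, rate=0.1, name='sigma')  # prior on the error scale
        yield tfd.Normal(loc=alpha + tf.linalg.matvec(X, beta),
                         scale=sigma, name='y')                             # likelihood for the outcomes
    return model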

To understand this properly, I would very much recommend some reading on Bayesian statistics, in particular the first few chapters of the highly regarded "Bayesian Data Analysis" by Gelman et al.

MCMC

In the above two sections, we've constructed a Bayesian linear regression model, but so far we've not said anything about how we obtain estimates for the coefficients. MCMC (Markov chain Monte Carlo) is a method for drawing random samples from the posterior distributions of the coefficients. It is particularly useful in cases where we have a lot of unknown quantities and their posterior distributions cannot be calculated analytically (which in practice is the case for all but the simplest of models). In the code above, you use HamiltonianMonteCarlo, which is a flavour of MCMC.
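Continuing the sketch above, posterior samples for that regression model could be drawn with TFP's HMC kernel roughly like this; the step size, number of leapfrog steps, initial state, and the observed outcomes y_observed are all illustrative assumptions:

# `X` is the design matrix and `y_observed` the observed outcomes from the sketch above
tfb = tfp.bijectors
joint = bayesian_regression(X)

def target_log_prob_fn(alpha, beta, sigma):
    # Log joint density with the outcome component pinned to the observed data
    return joint.log_prob(alpha, beta, sigma, y_observed)

initial_state = [tf.zeros([]), tf.zeros([X.shape[-1]]), tf.ones([])]

kernel = tfp.mcmc.TransformedTransitionKernel(
    inner_kernel=tfp.mcmc.HamiltonianMonteCarlo(
        target_log_prob_fn=target_log_prob_fn,
        num_leapfrog_steps=20,
        step_size=0.01),
    bijector=[tfb.Identity(), tfb.Identity(), tfb.Softplus()])  # keeps sigma positive during sampling

@tf.function
def run_chain():
    return tfp.mcmc.sample_chain(
        num_results=5000,
        num_burnin_steps=500,
        current_state=initial_state,
        kernel=kernel,
        trace_fn=lambda _, kr: kr.inner_results.is_accepted)

samples, is_accepted = run_chain()
posterior_beta_mean = tf.reduce_mean(samples[1], axis=0)  # posterior mean of the coefficients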

Conclusion

In conclusion, I think you need to go back to the drawing board and understand what you are trying to achieve before asking ChatGPT to summon up some code for you. Actually, I think the code above is non-executable, and as a complete example it doesn't actually implement any kind of regression model, Bayesian or otherwise. Happy to help further if you can refine your use case and modelling objective.

Regards,

Chris

@Peridein
Author

Peridein commented Jul 8, 2024

Dear Chris, thank you for your kind reply.
First of all, it is true that I do not fully understand Bayesian concepts. I was working on a project that involved ideas I had never seen before, and because I was short on time I lacked depth in my understanding of them. That is why I feel ashamed: I felt my attitude toward coding was insincere. Still, thank you very much for your sincere response.

The project I am working on now concerns a specific sensor in a heat pump system (hereinafter referred to as sensor A). We predict the value of sensor A using only the data from the other sensors associated with it. By comparing the actual measured value of sensor A with its predicted value, we are creating a model that detects when an error has occurred in sensor A. The theoretical basis of the constructed equation is the heat capacity equation:

$$Q = Cm\Delta T = m\Delta h$$

The model's equation contains no free parameters: in each calculation, every other value is determined from values previously obtained through experiments, so everything in the equation is fixed except the predicted value of sensor A. However, the predicted value of sensor A is described by a mean and a standard deviation rather than by a single exact value.

In the code above, the likelihood in Bayes' theorem is based on the precision of sensor A. At each step, the model is influenced by sensor A's precision and includes the probability that the value measured by sensor A is the true value.

To sum up, I have designed a model for failure detection and correction of sensor A in the heat pump system. The model incorporates the precision of sensor A and of the other sensors that affect it. What matters in the calculation is not deriving parameters for each sensor value; what matters is the answer to the question, "What is the probability distribution of sensor A's predicted value, and what are its mean and standard deviation?" A minimal sketch of this idea follows.
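(The sketch below uses entirely made-up numbers and names: the prior on sensor A's true value is centred on its reading with a scale set by the sensor precision, the likelihood comes from a physics-based prediction derived from the other sensors, and the posterior mean and standard deviation are read off the MCMC samples.)

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Made-up numbers, purely to illustrate the idea described above
sensor_a_reading = 45.0     # the value sensor A actually reports
sensor_accuracy = 0.90      # assumed precision of sensor A

# Prior belief about sensor A's true value: centred on the reading,
# with a scale that grows as the sensor becomes less accurate
prior = tfd.Normal(loc=sensor_a_reading,
                   scale=(1.0 - sensor_accuracy) * sensor_a_reading)

# Prediction of sensor A's value from the other sensors via Q = Cm(dT) = m(dh),
# stood in here by a single made-up number with an assumed uncertainty
physics_prediction = 44.2
physics_scale = 1.0

def target_log_prob_fn(true_value):
    # Posterior (up to a constant): prior on the true value times the
    # likelihood of the physics-based prediction given that true value
    likelihood = tfd.Normal(loc=true_value, scale=physics_scale)
    return prior.log_prob(true_value) + likelihood.log_prob(physics_prediction)

samples = tfp.mcmc.sample_chain(
    num_results=2000,
    num_burnin_steps=500,
    current_state=tf.constant(sensor_a_reading),
    kernel=tfp.mcmc.HamiltonianMonteCarlo(
        target_log_prob_fn=target_log_prob_fn,
        num_leapfrog_steps=10,
        step_size=0.1),
    trace_fn=None)

posterior_mean = tf.reduce_mean(samples)     # expected true value of sensor A
posterior_std = tf.math.reduce_std(samples)  # its uncertainty; a large residual vs. the reading would flag a fault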

I am currently trying to study the theoretical concepts I lack, and I am not yet sure that the concepts I understand match the goals of the model. I am still studying Bayesian statistics and MCMC at the same time, but I am asking you this question without hesitation because the project has to proceed on a tight schedule. I hope you can help me where I fall short. Thank you for reading this long message.
