Data matrix losing dimensionality in generate_quantities() #965

Open
gowerc opened this issue May 9, 2024 · 2 comments

Comments


gowerc commented May 9, 2024

Describe the issue

Sorry, I wasn't sure where to post this, as I can't tell whether it is a bug or not. I can't replicate the issue with a simpler bit of code, so it may very well be an issue on my end...

Essentially I am trying to make use of the model$generate_quantities() method but am running into issues with data conversion between R -> Stan. That is...

model$generate_quantities(
    data = data,
    fitted_params = object@results
)

Chain 1 Exception: mismatch in dimension declared and found in context;
processing stage=data initialization; variable name=gq_link_function_inputs;
position=0; dims declared=(21,4); dims found=(84,1)

The error message also points to this line of Stan code in the data block:

    int<lower=1> gq_n_quant;
    int<lower=0> gq_n_par;
    matrix[gq_n_quant, gq_n_par] gq_link_function_inputs;     // <--------------

But when inspecting the data object being passed into the method call using browser(), I can see:

dim(data[["gq_link_function_inputs"]])
[1] 21  4

data[["gq_link_function_inputs"]]
      [,1] [,2] [,3] [,4]
 [1,] 50   0.6  0.3  0.5 
 [2,] 50   0.6  0.3  0.5 
 [3,] 50   0.6  0.3  0.5 
 [4,] 50   0.6  0.3  0.5 
 [5,] 50   0.6  0.3  0.5 
 [6,] 50   0.6  0.3  0.5 
 [7,] 50   0.6  0.3  0.5 
 [8,] 50   0.6  0.3  0.5 
 [9,] 50   0.6  0.3  0.5 
[10,] 50   0.6  0.3  0.5 
[11,] 50   0.6  0.3  0.5 
[12,] 50   0.6  0.3  0.5 
[13,] 50   0.6  0.3  0.5 
[14,] 50   0.6  0.3  0.5 
[15,] 50   0.6  0.3  0.5 
[16,] 50   0.6  0.3  0.5 
[17,] 50   0.6  0.3  0.5 
[18,] 50   0.6  0.3  0.5 
[19,] 50   0.6  0.3  0.5 
[20,] 50   0.6  0.3  0.5 
[21,] 50   0.6  0.3  0.5

class(data[["gq_link_function_inputs"]])
[1] "matrix" "array"

data[["gq_n_par"]]
[1] 4

data[["gq_n_quant"]]
[1] 21

Any idea what might be going on, or whether there's anything I can run to get more diagnostic information? As mentioned, I tried setting up a minimal example but wasn't able to replicate the issue with a smaller piece of code, sorry.

CmdStanR version number
‘0.7.1’


gowerc commented May 10, 2024

OK, I finally managed to create a minimal reproducible example. It looks more like a bug in my R code, but it's still a bit weird...

library(cmdstanr)

stan_code_2 <- "
data {
    vector[50] x;
    matrix[3, 2] mat;
}
parameters {
    real mu_x;
}
model {
    target += normal_lpdf(x | mu_x, 2);
}
"

mod2 <- cmdstan_model(
    stan_file = cmdstanr::write_stan_file(stan_code_2)
)

mat <- structure(list(0, 0, 0, 0, 0, 0), .Dim = c(3L, 2L))

zres <- mod2$sample(
    data = list(
        x = rnorm(50, 4, 2),
        mat = mat
    )
)
zres

The issue is that the matrix I've created is malformed. Normally the dput of a matrix will be:

mat <- structure(c(0, 0, 0, 0, 0, 0), .Dim = c(3L, 2L))

Note the use of c() instead of list(). I'm not entirely sure how my matrix got into this state, but it seems to trip cmdstanr up. That said, I'm not sure this is a bug in cmdstanr per se, as this doesn't appear to be a valid matrix: nearly all standard matrix operations fail, e.g.

> mat <- structure(list(0, 0, 0, 0, 0, 0), .Dim = c(3L, 2L))
> mat + 1
Error in mat + 1 : non-numeric argument to binary operator
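
A quick way to spot this kind of object is to look at its storage type rather than its class. The sketch below uses only base R; the names `mat_list` and `mat_num` are made up for illustration. It also hints at why the earlier error may have reported dims of (84,1): a 21 x 4 list-matrix is 84 cells of length 1, although exactly how the data gets serialized is an assumption here.

```r
# Same malformed object as above: a list that carries a dim attribute.
mat_list <- structure(list(0, 0, 0, 0, 0, 0), .Dim = c(3L, 2L))
# The well-formed equivalent: a double vector with the same dim attribute.
mat_num <- structure(c(0, 0, 0, 0, 0, 0), .Dim = c(3L, 2L))

is.matrix(mat_list)   # TRUE, since only the dim attribute is checked
class(mat_list)       # "matrix" "array", indistinguishable from mat_num
typeof(mat_list)      # "list"
is.numeric(mat_list)  # FALSE

typeof(mat_num)       # "double"
is.numeric(mat_num)   # TRUE

# Each cell of the list-matrix is its own length-1 vector.
lengths(mat_list)
```

So `class()` and `dim()` look fine in `browser()`, while `typeof()` or `is.numeric()` gives the problem away.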


gowerc commented May 10, 2024

Sorry, final message. It turns out my matrix was "broken" because I was constructing it from a list of data, e.g.

x <- matrix(
    list(1, 2, 3, 4),
    nrow = 2,
    ncol = 2
)

I guess this is just very unintuitive base R behaviour...
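
For comparison, here is a minimal sketch of the same construction with the list flattened first, assuming every element is a single number. The variable names (`cells`, `x_list`, `x_num`) are illustrative only.

```r
cells <- list(1, 2, 3, 4)

# Passing the list straight to matrix() keeps list storage.
x_list <- matrix(cells, nrow = 2, ncol = 2)

# Flattening with unlist() first gives an ordinary numeric matrix.
x_num <- matrix(unlist(cells), nrow = 2, ncol = 2)

typeof(x_list)  # "list"
typeof(x_num)   # "double"

# Arithmetic works on the numeric version.
x_num + 1
```

Both objects print almost identically, which is probably why this is so easy to miss.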

It would be awesome if cmdstanr could be updated to still accept objects like these, given that they are still valid in the sense that they contain all the needed data. That said, I understand this is a pretty niche edge case and may open a can of worms on error handling (e.g. the case where a cell has 2 list entries), so no worries if you feel there is nothing to be done here.
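
In the meantime, something like the following could be applied on the user side before passing data to the model. This is only a sketch: `as_numeric_matrix()` is a hypothetical helper, not part of cmdstanr, and it simply refuses any cell that does not hold exactly one numeric value (which covers the "2 list entries" case).

```r
# Hypothetical helper (not part of cmdstanr): coerce a list-matrix to numeric.
as_numeric_matrix <- function(x) {
    if (!is.list(x)) {
        return(x)  # already atomic, nothing to do
    }
    # Every cell must be a single numeric value, otherwise the shape is ambiguous.
    ok <- vapply(
        x,
        function(cell) is.numeric(cell) && length(cell) == 1L,
        logical(1)
    )
    if (!all(ok)) {
        stop("every cell must hold exactly one numeric value")
    }
    # unlist() walks the cells in column-major order, so the dims can be reapplied.
    array(unlist(x), dim = dim(x), dimnames = dimnames(x))
}

data_list <- list(mat = matrix(list(1, 2, 3, 4), nrow = 2, ncol = 2))
data_list$mat <- as_numeric_matrix(data_list$mat)
typeof(data_list$mat)  # "double"
```

Failing loudly on multi-entry cells keeps the error handling simple, at the cost of rejecting inputs that might otherwise be interpretable.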
