Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent field quoting in CSV response (at least for functions) #3621

Closed
ggam opened this issue Jun 26, 2024 · 1 comment
Closed

Inconsistent field quoting in CSV response (at least for functions) #3621

ggam opened this issue Jun 26, 2024 · 1 comment
Labels

Comments

@ggam
Copy link

ggam commented Jun 26, 2024

Environment

  • PostgreSQL version: 15.3
  • PostgREST version: 12.2
  • Operating system: Debian

Description of issue

I have the following function definition:

CREATE TYPE schema.function_return_type AS (
	aa_numeric numeric,
	aa_text varchar,
	xx_text text,
	ddddd_timestamp timestamp,
	yyyy_date date,
	zzzzz_text text);

CREATE OR REPLACE FUNCTION schema.postgrest_function()
 RETURNS SETOF schema.function_return_type
 LANGUAGE sql
 IMMUTABLE PARALLEL SAFE
AS $function$
    select *
    from tableblabla

    $function$
;

And request text/csv content:

curl --location 'http://localhost:3000/rpc/postgrest_function' --header 'Accept: text/csv'

The response handles field types inconsistently:

aa_numeric,aa_text,xx_text,ddddd_timestamp,yyyy_date,zzzzz_text
35178153,999916286232,011-XXX,"2023-12-13 10:21:48",2024-02-21,"99999999-38 XXXXX"

aa_text is returned unquoted as it doesn't contain any non-numeric character. xx_text is also unquoted even if it has a hyphen and text. zzzzz_text has a similar pattern to xx_text but is quoted. Dates are correctly unquoted, but timestamps are quoted.

My expection is that all numeric and date-related fields would be unquoted. The rest would be quoted. I haven't tried if this problem exists also on tables/views.

For the time being, I guess I'll need to duplicate all the functions with a separate text/csv domain in order to get a consistent behaviour. Or is there any way to override the default implementation?

@laurenceisla
Copy link
Member

laurenceisla commented Jun 27, 2024

AFAICS, there isn't a standard for CSV, the closest one being the RFC 4180. There, it says that "Each field may or may not be enclosed in double quotes". It does not mention quoting according to the type.

The response handles field types inconsistently:

It looks that way because it doesn't take into consideration the type at all. The response is a result of casting a PostgreSQL composite type (a table row in this case) to text. For instance:

select  row('text', 'text with space', 'text with " quote');

-- Result:
-- (text,"text with space","text with "" quote")

Some are wrapped because they have certain special characters in the value (e.g. " or space).

My expection is that all numeric and date-related fields would be unquoted. The rest would be quoted.

Since this doesn't adhere to the RFC, I'm sure we won't be implementing this in PostgREST either. I think your CSV parser should understand quoted values when present regardless of the type.

Also, issue #1374, which also had problems with the CSV output, was closed with Overriding a Built-in Handler as the solution. So this would be your workaround if you need to implement it.

Due to both of the points above, I'll be closing this. Feel free to reopen if you disagree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

2 participants