Inference in READ commands

Hello again!

I’m looking at the simple example in the Tutorial and at the Example demo products. Can you please explain the difference between using a READ() command with and without inference?

For instance, I run the following in Scratchpad:

  SELECT party, type, gender, state
  FROM 
    READ_CSV("https://theunitedstates.io/congress-legislators/legislators-current.csv")

RAW appears to be using inference here: the file has a header row, but the headers do not appear in the result set.

If I read the same file again as in the code below, giving it a typealias, does that mean inference is switched off? I ask because, unless I specify skip:=1, the header row shows up in the data itself.


typealias first_cols := record(
    `last_name`: string,
    `first_name`: string,
    `middle_name`: string nullable,
    `suffix`: string nullable,
    `nickname`: string nullable,
    `full_name`: string,
    `birthday`: string nullable,
    `gender`: string,
    `type`: string,
    `state`: string,
    `district`: string nullable,
    `senate_class`: string nullable,
    `party`: string
);


  SELECT party, type, gender, state
  FROM 
    READ_CSV[first_cols]("https://theunitedstates.io/congress-legislators/legislators-current.csv",skip:=1)

Could you please confirm when inference is switched off?
E

Hello Elliot,

You are correct: specifying a type between brackets in READ_CSV disables inference. Instead of inspecting the file to work out the parsing rules, RAW runs its parser with that type to parse the columns. It also falls back to the default header setting (skip := 0), which is why the header row shows up as data unless you skip it explicitly.

The same applies to every parameter that would otherwise be inferred. For example, a comma is used as the delimiter and UTF-8 as the encoding; neither of these is inferred.

All of the default values used when inference is disabled are documented in the READ_CSV section of the reference guide.
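To make the distinction concrete, here is a rough Python analogy of the two modes. It is not RAW code and not how RAW is implemented: the function names `read_fixed` and `read_inferred`, the sample rows, and the toy header heuristic are all made up for illustration. The only points taken from the answer above are the defaults when no inference runs (comma delimiter, skip of 0).

```python
import csv
import io

# Made-up sample in the spirit of the legislators file (not the real rows).
SAMPLE = (
    "last_name,state,district\n"
    "Smith,OH,3\n"
    "Jones,WA,7\n"
)


def read_fixed(text, delimiter=",", skip=0):
    """No inspection of the data: parse with the given settings only.

    The defaults mirror the ones mentioned above (comma delimiter, skip of 0).
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows[skip:]


def read_inferred(text):
    """Toy inference (hypothetical heuristic): look at the data first."""
    first_line = text.splitlines()[0]
    # Guess the delimiter: the candidate that occurs most often in line one.
    delimiter = max(",;\t", key=first_line.count)
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    # Guess the header: some column is numeric in every row except the first.
    has_header = any(
        not rows[0][i].isdigit() and all(r[i].isdigit() for r in rows[1:])
        for i in range(len(rows[0]))
    )
    return rows[1:] if has_header else rows


if __name__ == "__main__":
    print(read_fixed(SAMPLE))          # header row comes back as data
    print(read_fixed(SAMPLE, skip=1))  # header row skipped explicitly
    print(read_inferred(SAMPLE))       # header row detected and dropped
```

With fixed settings the header line is returned as the first data row unless a skip of 1 is passed, which matches what you saw with the typealias version of the query.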

Thanks very much! :smiley: