Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More work on types for tables examples. #779

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

duvenaud
Copy link
Contributor

@duvenaud duvenaud commented Mar 4, 2022

This isn't finished yet, but could be merged now since it's self-contained.

Thanks to Adam's extensions, this is going smoothly so far. It's surprising how often the code is correct once it typechecks. Thanks to punning and lambdas, the code is about as terse as I think is possible. For example, one of the examples is written as:

> isTeenagerBuilder =
    function(r):
      12 < getValue(r, "age") and getValue(r, "age") < 20
    end
> buildColumn(students, "is-teenager", isTeenagerBuilder)

which translates to Dex as:

buildColumn students ##is_teenager \{age, ..._}.
  (12 < age ) && (age < 20)

I'm not sure whether to leave the original examples in as comments. It's nice to see how much shorter the Dex versions are, but the examples and example outputs are pretty long.

I played a bit with indexing by sets of strings, but it's a bit more awkward than I expected. Things like adding a row, or concatenating two tables then sometimes mean changing the index type, which requires lots of type-level computation which isn't supported. I tried to make a set-erased SetIxSet, kind of like an unordered list, but then I don't think there's enough information available at compile time for it to fulfill the Ix interface.

I've been getting random segfaults that crash the web notebook, but everything always works when I restart.

I feel like the records system is getting a bit heavy on the new syntax. Could we potentially replace Fields with simply List Label or n=>Label, and avoid introducing the syntax for creating new Fields?

There are also a couple of minor bugs or inconsistencies here and there in the original examples.

@duvenaud duvenaud changed the title [WIP] More work on types for tables examples. More work on types for tables examples. Mar 7, 2022
Copy link
Collaborator

@apaszke apaszke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

]


'# Experiment: Indexing by sets
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please move the experiment to the end?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I was originally planning to do the entire file twice, once with indexing by row number and once indexing by row elements, but indexing by row elements doesn't work yet.

-- | "Bob" | 12 | 8 | 9 | 77 | 7 | 9 | 87 |
-- | "Alice" | 17 | 6 | 8 | 88 | 8 | 7 | 85 |
-- | "Eve" | 13 | 7 | 9 | 84 | 8 | 8 | 77 |
:p addRows students [{name="Colton", age=19, favColor="blue"}]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think you have to use :p anymore.

Also, once this is done, can you please check in the outputs, so that we can turn it into a test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, I was only using it because I thought there was a reason you guys were using it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I think we should really delete it... I don't see a good reason to keep it.

-- | true | false | false |
-- | true | false | true |
-- | =True | =False | =False |
-- | =True | =False | =True |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird formatting? Why the = before booleans?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, that's left over from a find-and-replace when I was formatting all the tables to dex tables.

-- | =True | =False | =True |


-- :p crossJoin students petiteJelly
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross join examples are now attached to subTable, not to crossJoin

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, got it.

(AsList num_found found_table) = argFilter (\x. x == v) xs
case num_found == 0 of
True -> Nothing
False -> Just found_table.(unsafeFromOrdinal _ 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also do this:

data First a = MkFirst (Maybe a)

def FirstMonoid (a:Type) : Monoid (First a) =
  named-instance result : Monoid (First a)
    mempty = MkFirst Nothing
    mappend = \x y. case x of
      Just xv -> x
      Nothing -> y
  result

def findInUnordered {a n} [Eq a] (xs:n=>a) (v:a) : Maybe n =
  MkFirst ans = withAccum \ref.
    for_ i. case xs.i == v of
      True  -> ref += MkFirst (Just n)
      False -> ()
  ans

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, yes I guess that'd be a bit more efficient. I wonder if someday we can have some sort of Lattice typeclass and corresponding effect which lets the compiler know about short-circuiting completions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be pretty neat! We could have a flavor of Accum that works over a lattice and short-circuits (when the computation is otherwise pure?)!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that'd be a good showcase of Dex's effect-aware compilation. I've come across a couple of other examples that could in principle be sped up using a parallel short-circuit:

  • The any and all functions.
  • The Ord instance for lists which returns at the location of first difference

@@ -347,43 +345,35 @@ def getValue {f a} (c:Label) (r:{@c:a & ...f}) : a =
-- ### (overloading 1/2) `getColumn :: t:Table * n:Number -> vs:Seq<Value>`

-- ### (overloading 2/2) `getColumn :: t:Table * c:ColName -> vs:Seq<Value>`
def getColumn {f n a} (c:Label) (t:n=>{@c:a & ...f}) : n=>a = todo

-- Define above.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Defined

-- > head(students, -2)

:p selectColumns {favColor:_ ? age:_} students
:p selectColumns {final:_ ? name:_ ? midterm:_} gradebook
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this replace subTable from above? I think at some point the benchmark authors might have renamed subTable to selectColumns and only forgot to update that one example. Let's use this name instead.

(UnsafeAsSet _ uniquetable) = toSet t
AsList _ uniquetable

:p distinct students
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that work? I would imagine that we don't have an Ord instance for records?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work yet. I think 3 of the examples parse but still fail with one error or another that I'm hoping to chat with you about.

Sorry, I should have been clearer that this PR was in the same spirit as the one you merged which introduced this file, which is "this doesn't fully work yet but is better than what came before." I learned it from you! :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, WIP is totally fine. I was only a bit confused, because it looks as if it was supposed to work, but I couldn't figure out how :)

xlists = for i:n. (AsList 1 ([(xs.i, 1)]))
reduce (AsList 0 []) mergeUniqueSortedListsWithCounts xlists

def count {n a f} [Ord a] (c:Label) (t:n=>{@c:a & ...f}) : List {value:a & count:Int} =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another place where a mutable HashMap would probably be a better fit

True ->
concat [allExceptLast xs,
(AsList 1 [(last_x, concat [x_b_list, y_b_list])]),
unsafeAllExceptFirst ys]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is duplicated way too many times. Is there a nice generic version we could write once?

Copy link
Contributor Author

@duvenaud duvenaud Mar 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, certainly. I'll unify these the next time I work on this. I just wanted to write the specific instances first so that I'd know what the general version needed to look like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants