Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix splitting of toplevel phrases #1533

Open
pitag-ha opened this issue Apr 15, 2021 · 14 comments
Open

Fix splitting of toplevel phrases #1533

pitag-ha opened this issue Apr 15, 2021 · 14 comments
Labels
medium More Complex Issues for Outreachy

Comments

@pitag-ha
Copy link
Member

There is a FIXME in the code generating html from toplevel blocks. Notice that in the OCaml toplevel (in other languages called REPL), the end of a code phrase is signalized by ;;. I suggest fixing the mentioned FIXME, which consists in two things:

  • at the moment every ;; is split on, even if that ;; belongs to a string inside the code. That needs to be fixed
  • if it was possible to find a more efficient way to do the splitting than using regex, for example using String.split_on_char and/or similar, that would be great.
@patricoferris
Copy link
Contributor

It would also be cool if someone could demonstrate some rough benchmarks for different approaches to improving the efficiency too :))

@patricoferris patricoferris added the medium More Complex Issues for Outreachy label Apr 16, 2021
@yashi-hub
Copy link
Contributor

hey, @patricoferris @pitag-ha I'm interested in working on this issue , can I give it a try :))

@Riddhima23
Copy link
Contributor

Hey! @pitag-ha @patricoferris ! I wanted to work on a "medium" issue. This looks good to me! Can I give it a try :)

@pitag-ha
Copy link
Member Author

Sure @yashi-hub, you can go ahead :)

@pitag-ha
Copy link
Member Author

Oh wait, I've reacted a bit too fast. Sorry about that! @yashi-hub: I've just remembered that you are already working on an issue (#1402), aren't you?

@yashi-hub
Copy link
Contributor

@pitag-ha i am almost done with that issue. I believe I will create a PR by tomorrow for that.
And I'll be able to work efficiently for this issue :)

@pitag-ha
Copy link
Member Author

Thanks @yashi-hub! Please open a PR for #1402 before starting to work on another issue. We want folks to focus on issues one by one. By the way, you've done a great job so far, thanks for that! :)

@pitag-ha
Copy link
Member Author

@Riddhima23, sorry for the confusion! You can go ahead and work on this :)

@Riddhima23
Copy link
Contributor

Alright @pitag-ha! Thanks! I'll reach out if I have any doubt :))

@Riddhima23
Copy link
Contributor

@pitag-ha what I have to do is basically to find a more efficient way of splitting strings instead of using regular expressions. For example, let the string be"Ocaml;;great" now I want the output as "Ocaml;;great" but using regular expression(current method) , output would be somewhat " Ocaml" "great" means it splits the words instead of printing it as a whole which needs to be fixed. Please confirm if I have understood it correctly or not :)

@pitag-ha
Copy link
Member Author

Hi @Riddhima23, sorry for the late reply!

let the string be"Ocaml;;great" now I want the output as "Ocaml;;great" but using regular expression(current method)

I don't exactly understand this sentence. What is it that you mean?

I have the impression that you've understood correctly, but I'm not sure :) Just to make sure, let me give a concrete example building on your example. Suppose you have the following toplevel block:

print_endline "OCaml;;great";;
print_endline "OCaml;;super great";;

We want that block to be split into the two toplevel phrases
print_endline "OCaml;;great",
print_endline "OCaml;;super great"
What happens at the moment is that it is split into the four senseless phrases
print_endline "OCaml,
great",
print_endline "OCaml,
super great".

So one part of this issue consists in fixing that.

The other part of the issue consists in perform the splitting in a more efficient way than done now (as you've said, now it's done via regex). As I've mentioned in the issue description, for that you you can see if you can use functions like String.split_on_char or similar. Once you've found a different way for the splitting, it would be cool if you could do some benchmarking both on the old way (regex) and on your new way to compare the two.

@Riddhima23
Copy link
Contributor

@pitag-ha Thanks for the example.I meant something like this but maybe couldn't put it :) Now I have completely understood the issue.I'll get started and let you know if I have any further query :))

@gurleennsidhuu
Copy link
Contributor

@pitag-ha Thanks for the example.I meant something like this but maybe couldn't put it :) Now I have completely understood the issue.I'll get started and let you know if I have any further query :))

Hi @Riddhima23. Are you still working on this issue? :)

@Riddhima23
Copy link
Contributor

@gurleennsidhuu yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
medium More Complex Issues for Outreachy
Projects
None yet
Development

No branches or pull requests

5 participants