csv plus loading text in xlm #4

Open
AnnaWeronikaMatysiak opened this issue Oct 12, 2022 · 3 comments

@AnnaWeronikaMatysiak

Dear Max,
There appears to be a mistake in the CSV you added: the text for "Infant Sorrow" has other poems in it, so this particular poem shows up as having 111 lines. I am not sure whether this was done on purpose, so I thought I would mention it.

Dear all,
I also ran into a small issue. I am working in R and trying to access the text spoken in the parliament. However, I cannot access the elements J_1 or J. I managed to get all the names and surnames, but selecting these elements is not working for me. I can get the text with html_text(rede_data), but this returns every text element in rede, including names, comments, and party names. I could clean it up, but that takes additional time, so I am wondering whether someone has found a way around this?

@mcallaghan
Owner

Ah, great catch. No, this was not intentional, though it is a great example of what can happen if you do not check your results carefully enough :). I will fix this tomorrow, and of course anyone using the old data will not be penalised for the mistake.

@mcallaghan
Owner

On getting the J elements, have you tried "attribute selectors"? See https://www.w3schools.com/css/css_attribute_selectors.asp

For example, 'p[type="speech"]' would select all <p> elements where the "type" attribute is "speech". If you want all the <p> elements where the "klasse" attribute is "J", then "klasse" is your attribute name and "J" is the value.
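
A minimal sketch of that attribute-selector approach in R, assuming the xml2 and rvest packages are installed. The XML fragment, the speaker name, and the variable names are made up for illustration; the real protocol file may be structured differently (for example, it may nest elements more deeply).

```r
library(xml2)
library(rvest)

# Hypothetical fragment standing in for one <rede> element of the protocol.
xml_string <- '<rede id="ID1">
  <p klasse="redner">Erika Mustermann (SPD):</p>
  <p klasse="J_1">First paragraph of the speech.</p>
  <p klasse="J">Second paragraph of the speech.</p>
  <kommentar>(Beifall)</kommentar>
</rede>'
rede <- read_xml(xml_string)

# CSS attribute selectors: keep only <p> elements whose "klasse"
# attribute is exactly "J_1" or exactly "J". The comma means "either".
speech_nodes <- html_elements(rede, 'p[klasse="J_1"], p[klasse="J"]')

# Extract the text of the selected nodes only.
speech_text <- html_text(speech_nodes)
print(speech_text)
```

The speaker line and the comment are left out because their elements do not match either selector, so no cleaning step is needed afterwards.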

@AnnaWeronikaMatysiak
Author

I tried this in multiple configurations and it didn't work. I found a workaround with nodes <- xml_find_all(rede_links, ".//p"), but this later requires a lot of work to join the text elements, and it does not exclude the elements with klasse="redner". If anyone has tried this approach and made any progress, let me know! :)
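
One possible way to extend that xml_find_all() workaround, sketched under assumptions: an XPath predicate can drop the klasse="redner" paragraphs, and paste() can join the remaining text per speech. The document below is invented for illustration, and the names doc, reden, and speech_text are hypothetical; this is not a tested solution for the actual protocol file.

```r
library(xml2)

# Hypothetical document with two speeches; the real file may differ.
doc <- read_xml('<sitzungsverlauf>
  <rede id="ID1">
    <p klasse="redner">Erika Mustermann (SPD):</p>
    <p klasse="J_1">First paragraph.</p>
    <p klasse="J">Second paragraph.</p>
    <kommentar>(Beifall)</kommentar>
  </rede>
  <rede id="ID2">
    <p klasse="redner">Max Beispiel (CDU):</p>
    <p klasse="J_1">Another speech.</p>
  </rede>
</sitzungsverlauf>')

# One node per speech.
reden <- xml_find_all(doc, ".//rede")

# For each speech, select its <p> children except those with
# klasse="redner", then join their text into a single string.
speech_text <- vapply(seq_along(reden), function(i) {
  p_nodes <- xml_find_all(reden[[i]], "./p[not(@klasse='redner')]")
  paste(xml_text(p_nodes, trim = TRUE), collapse = " ")
}, character(1))

# Label each joined text with the id attribute of its speech.
names(speech_text) <- xml_attr(reden, "id")
print(speech_text)
```

Running the selection inside each rede node keeps the paragraphs grouped by speech, which avoids joining text elements back together afterwards.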
