[ORF] Extractor overhaul #32802

Merged
merged 7 commits into from
Jun 11, 2024

dirkf commented Jun 1, 2024

Boilerplate: own/yt-dlp code, fix/new extractor

Please follow the guide below

  • You will be asked some questions, please read them carefully and answer honestly
  • Put an x into all the boxes [ ] relevant to your pull request (like that [x])
  • Use Preview tab to see how your pull request will actually look like

Before submitting a pull request make sure you have:

In order to be accepted and merged into youtube-dl each piece of code must be in public domain or released under Unlicense. Check one of the following options:

  • I am the original author of this code and I am willing to release it under Unlicense, except for code from yt-dlp for which this or the below has been asserted.
  • I am not the original author of this code but it is in public domain or released under Unlicense (provide reliable evidence)

What is the purpose of your pull request?

  • Bug fix
  • Improvement
  • New extractor
  • New feature

Description of your pull request and other information

This PR overhauls and brings up-to-date the ORF extractor module.

TV is now handled by the on.orf.at single-page-app, replacing the former TVthek. The PR back-ports the ORFONIE recently added to yt-dlp to replace the broken ORFTVthekIE: thx @HobbyistDev, @TuxCoder, @seproDev. Resolves #32798.

Some shows (former extended live events?) may be returned as multi_video playlists: yt-dl has no support for automatic concatenation of multi_video playlists (yet), but if the same format is selected for all items (segments) it should be possible to concatenate the A-V streams without re-encoding using ffmpeg.
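One way to do that manual concatenation is ffmpeg's concat demuxer with stream copy. The sketch below is only an illustration under stated assumptions: the helper names `concat_list` and `concat_command` and the file names are hypothetical, and it assumes every segment was downloaded in the same format so that the streams are compatible.

```python
def concat_list(segment_paths):
    """Build the text of a list file for ffmpeg's concat demuxer."""
    lines = []
    for path in segment_paths:
        # Paths are single-quoted in the list file; a literal single
        # quote inside a path is written as '\''
        escaped = path.replace("'", "'\\''")
        lines.append("file '%s'" % escaped)
    return "\n".join(lines) + "\n"


def concat_command(list_path, output_path):
    """Return an ffmpeg argument list that joins the listed segments."""
    # -f concat selects the concat demuxer, -safe 0 accepts arbitrary
    # paths in the list, and -c copy remuxes without re-encoding.
    return ["ffmpeg", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", output_path]


if __name__ == "__main__":
    # Hypothetical segment file names, for illustration only
    print(concat_list(["00001-part.mp4", "00002-part.mp4"]))
    print(" ".join(concat_command("segments.txt", "joined.mp4")))
```

Writing the list to a file and running the returned command (for example with `subprocess.run`) would then produce a single output file, provided the segments really do share codecs and parameters.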

Livestream support (not widely tested) has been added. See also yt-dlp/yt-dlp#10052.

Similarly radio and audio clips are now handled through sound.orf.at. The ORFRadioIE extractor has been updated and made public to support this as well as residual xx.orf.at/player/... URLs. The previous per-station IEs are removed. Resolves #29394. See also yt-dlp/yt-dlp#5265, yt-dlp/yt-dlp#9581.
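To make the two URL shapes concrete, here is a minimal sketch that recognises both of them. The patterns and the `parse_orf_radio_url` helper are hypothetical and written only for illustration; they are not the extractor's actual `_VALID_URL` and cover only the URL forms tested later in this thread.

```python
import re

# Hypothetical patterns, for illustration only
SOUND_URL = re.compile(
    r'https?://sound\.orf\.at/radio/(?P<station>[a-z0-9]+)'
    r'/sendung/(?P<id>\d+)(?:/|$)')
PLAYER_URL = re.compile(
    r'https?://(?P<station>[a-z0-9]+)\.orf\.at/player/'
    r'(?P<date>\d{8})/(?P<id>\d+)')


def parse_orf_radio_url(url):
    """Return the named URL parts plus a 'kind' key, or None if no match."""
    for kind, pattern in (('sound', SOUND_URL), ('player', PLAYER_URL)):
        match = pattern.match(url)
        if match:
            return dict(match.groupdict(), kind=kind)
    return None


if __name__ == '__main__':
    print(parse_orf_radio_url(
        'https://oe1.orf.at/player/20240616/760429/1718557473314'))
    print(parse_orf_radio_url(
        'https://sound.orf.at/radio/oe1/sendung/196537/'
        'die-wiener-swing-institution'))
```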

ORFRadioCollectionIE is added to support playlists in ORF Sound.

The PR also back-ports and re-works ORFPodcastIE from yt-dlp: thx @Esokrates.

ORFFM4StoryIE is fixed and also now gets any in-page YT media: see yt-dlp/yt-dlp#2477.

Core YoutubeDL processing is tweaked to support some not previously encountered playlist structures.

* skip reason can't be unicode in Py2
* remove duplicate assert...Equal functions
* add `ORFONIE`, back-porting yt-dlp PR yt-dlp/yt-dlp#9113 and friends: thx HobbyistDev, TuxCoder, seproDev
* re-factor to support livestreams via new `ORFONliveIE`
* maintain support for xx.orf.at/player/... URLs
* add `ORFRadioCollectionIE` to support playlists in ORF Sound
* back-port and re-work `ORFPodcastIE` from yt-dlp/yt-dlp#8486, thx Esokrates
* fix getting media via DASH instead of inaccessible mp4
* also get in-page YT media

dirkf merged commit a48fe74 into ytdl-org:master Jun 11, 2024
14 checks passed
jkirk commented Jun 16, 2024

JFTR, I've tested 0153b38.

Both URLs, oe1.orf.at and sound.orf.at, do work:

❯ python3 -m youtube_dl "https://oe1.orf.at/player/20240616/760429/1718557473314"
[orf:sound] 760429: Downloading JSON metadata
[download] Downloading playlist: Die Wiener Swing-Institution
[orf:sound] playlist Die Wiener Swing-Institution: Downloading 1 videos
[download] Downloading video 1 of 1
[download] Destination: Die Wiener Swing-Institution-2024-06-16_1904_tl_51_7DaysSun32_2424227.mp3
[download]   2.2% of 55.61MiB at 258.91KiB/s ETA 03:35^C
ERROR: Interrupted by user

❯ python3 -m youtube_dl "https://sound.orf.at/radio/oe1/sendung/196537/die-wiener-swing-institution"
[orf:sound] 196537: Downloading JSON metadata
[download] Downloading playlist: Die Wiener Swing-Institution
[orf:sound] playlist Die Wiener Swing-Institution: Downloading 1 videos
[download] Downloading video 1 of 1
[download] Resuming download at byte 1303100
[download] Destination: Die Wiener Swing-Institution-2024-06-16_1904_tl_51_7DaysSun32_2424227.mp3
[download]   4.0% of 55.62MiB at 262.38KiB/s ETA 03:28^C
ERROR: Interrupted by user

Note, the links are (should be) available for 30 days from within Austria.

jkirk commented Jul 30, 2024

JFTR, I did another test and tried to download the following URL https://on.orf.at/video/14236058/olympische-spiele-paris-2024-die-eroeffnung.

❯ python3 -m youtube_dl --write-info-json -o "%(autonumber)s-%(title)s.%(ext)s" "https://on.orf.at/video/14236058/olympische-spiele-paris-2024-die-eroeffnung"
[orf:on] 14236058: Downloading webpage
[orf:on] 14236058: Downloading JSON metadata
[orf:on] Downloading m3u8 information
[orf:on] Downloading m3u8 information
[orf:on] Downloading MPD manifest
[orf:on] Downloading MPD manifest
[orf:on] Downloading m3u8 information
[orf:on] Downloading m3u8 information
[orf:on] Downloading MPD manifest
[orf:on] Downloading MPD manifest
[orf:on] Downloading m3u8 information
[orf:on] Downloading m3u8 information
[orf:on] Downloading MPD manifest
[orf:on] Downloading MPD manifest
[orf:on] Downloading m3u8 information
[orf:on] Downloading m3u8 information
[orf:on] Downloading MPD manifest
[orf:on] Downloading MPD manifest
[...]
[download] Downloading playlist: Olympische Spiele Paris 2024: Die Eröffnung
[orf:on] playlist Olympische Spiele Paris 2024: Die Eröffnung: Collected 16 video ids (downloading 16 of them)
[download] Downloading video 1 of 16
[info] Writing video description metadata as JSON to: 00001-Bootsparade - Einzug der Athleten.info.json
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 120
[...]
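
For readers wondering where the `00001-...` name in the log comes from: the `-o` value is a Python %-style template filled from each entry's metadata. The sketch below is a simplified illustration, not youtube-dl's actual code path. The `render_name` helper is hypothetical, the five-digit zero padding of `autonumber` is inferred from the `00001-` prefix in the log, and the `mp4` extension is assumed.

```python
def render_name(template, info):
    """Fill a %-style output template from a metadata dict."""
    return template % info


info = {
    'autonumber': '%05d' % 1,  # zero-padded counter, as seen in the log
    'title': 'Bootsparade - Einzug der Athleten',
    'ext': 'mp4',  # assumed extension, for illustration
}

name = render_name('%(autonumber)s-%(title)s.%(ext)s', info)
print(name)  # 00001-Bootsparade - Einzug der Athleten.mp4
```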

It seems to work, but I noticed that I got the "second" audio stream (audio description, AD) instead of the main audio stream. Not sure why.

If it helps, this is what I got after everything was downloaded:

❯ python3 -m youtube_dl -vUF --write-info-json -o "%(autonumber)s-%(title)s.%(ext)s" "https://on.orf.at/video/14236058/olympische-spiele-paris-2024-die-eroeffnung"
[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-vUF', '--write-info-json', '-o', '%(autonumber)s-%(title)s.%(ext)s', 'https://on.orf.at/video/14236058/olympische-spiele-paris-2024-die-eroeffnung']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: 0153b387e
[debug] Python 3.11.2 (CPython x86_64 64bit) - Linux-6.1.0-23-amd64-x86_64-with-glibc2.36 - OpenSSL 3.0.13 30 Jan 2024 - glibc 2.36
[debug] exe versions: ffmpeg 5.1.5-0, ffprobe 5.1.5-0
[debug] Proxy map: {}
It looks like you installed youtube-dl with a package manager, pip, setup.py or a tarball. Please use that to update.
[orf:on] 14236058: Downloading webpage
[orf:on] 14236058: Downloading JSON metadata
[...]
[orf:on] Downloading m3u8 information
[orf:on] Downloading MPD manifest
[orf:on] Downloading MPD manifest
[download] Downloading playlist: Olympische Spiele Paris 2024: Die Eröffnung
[orf:on] playlist Olympische Spiele Paris 2024: Die Eröffnung: Collected 16 video ids (downloading 16 of them)
[download] Downloading video 1 of 16
[info] Available formats for 15686542:
format code                                extension  resolution note
hls-audio-Deutsch-0                        mp4        audio only [de]
hls-audio-Deutsch-1                        mp4        audio only [de]
hls-audio-Deutsch_Audiodeskription__AD_-0  mp4        audio only [de]
hls-audio-Deutsch_Audiodeskription__AD_-1  mp4        audio only [de]
dash-p0aa0br192000-0                       m4a        audio only [de] DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-p0aa0br192000-1                       m4a        audio only [de] DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-p0aa1br192000-0                       m4a        audio only [de] DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-p0aa1br192000-1                       m4a        audio only [de] DASH audio  192k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-p0va0br801433-0                       mp4        640x360    DASH video  801k , mp4_dash container, avc1.64001e, 25fps, video only
dash-p0va0br801433-1                       mp4        640x360    DASH video  801k , mp4_dash container, avc1.64001e, 25fps, video only
hls-992-0                                  mp4        640x360     992k , video only
hls-992-1                                  mp4        640x360     992k , video only
dash-p0va0br1801486-0                      mp4        960x540    DASH video 1801k , mp4_dash container, avc1.64001f, 25fps, video only
dash-p0va0br1801486-1                      mp4        960x540    DASH video 1801k , mp4_dash container, avc1.64001f, 25fps, video only
hls-1992-0                                 mp4        960x540    1992k , video only
hls-1992-1                                 mp4        960x540    1992k , video only
dash-p0va0br3001472-0                      mp4        1280x720   DASH video 3001k , mp4_dash container, avc1.64001f, 25fps, video only
dash-p0va0br3001472-1                      mp4        1280x720   DASH video 3001k , mp4_dash container, avc1.64001f, 25fps, video only
hls-3192-0                                 mp4        1280x720   3192k , video only
hls-3192-1                                 mp4        1280x720   3192k , video only (best)
[download] Downloading video 2 of 16
[...]
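
As a possible workaround sketch, the audio-description variants in the listing above can be told apart by their format code, at least for the HLS entries. The `main_audio_formats` helper and the sample dicts below are hypothetical; the `vcodec` values are assumptions, and the DASH audio codes in the listing do not reveal which of them carries the audio description, so this covers only the HLS case.

```python
def main_audio_formats(formats):
    """Keep audio-only formats whose format_id is not marked as AD."""
    return [
        f for f in formats
        if f.get('vcodec') == 'none'
        and 'Audiodeskription' not in f['format_id']
    ]


# Sample entries modelled on the format listing; field values are assumed
sample = [
    {'format_id': 'hls-audio-Deutsch-0', 'vcodec': 'none'},
    {'format_id': 'hls-audio-Deutsch_Audiodeskription__AD_-0',
     'vcodec': 'none'},
    {'format_id': 'hls-3192-1', 'vcodec': 'avc1.64001f'},
]

print([f['format_id'] for f in main_audio_formats(sample)])
# ['hls-audio-Deutsch-0']
```

With an ID chosen this way, an explicit selection such as `-f hls-3192-1+hls-audio-Deutsch-1` should pair the video with the main audio track, assuming the format codes are the same for every item in the playlist.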

Successfully merging this pull request may close these issues.

  • Error: "Unsupported URL" because of website change ORF
  • ORF Radiothek has changed the URLs