Commit Graph

35 Commits

Author SHA1 Message Date
Andy
15acaea208 feat(dl): extract closed captions from HLS manifests and improve CC extraction
- Parse CLOSED-CAPTIONS entries from HLS manifests and attach CC metadata (language, name, instream_id) to video tracks
- Move CC extraction to run after decryption instead of before, fixing extraction failures on encrypted streams
- Extract CCs even when other subtitle tracks exist, using manifest CC language info instead of guessing
- Try ccextractor on the original file before repacking to preserve container-level CC data (e.g. c608 boxes) that ffmpeg remux strips
- Display deduplicated closed captions in --list output and download progress, positioned after subtitles
- Add closed_captions field to Video track class
2026-03-05 15:57:29 -07:00
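As a rough illustration of the manifest parsing described in 15acaea208, the sketch below pulls CLOSED-CAPTIONS renditions out of an HLS master playlist with a regular expression. The function name `parse_closed_captions` and the returned dict keys are hypothetical; the project's own parser may be structured differently.

```python
import re

# Matches KEY=VALUE pairs in an HLS attribute list; values may be quoted.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')
_PREFIX = "#EXT-X-MEDIA:"


def parse_closed_captions(playlist_text: str) -> list[dict]:
    """Return language, name and instream_id for each CLOSED-CAPTIONS rendition."""
    captions = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if not line.startswith(_PREFIX):
            continue
        attrs = {k: v.strip('"') for k, v in _ATTR_RE.findall(line[len(_PREFIX):])}
        if attrs.get("TYPE") != "CLOSED-CAPTIONS":
            continue  # audio, subtitle and video renditions are ignored here
        captions.append(
            {
                "language": attrs.get("LANGUAGE"),
                "name": attrs.get("NAME"),
                "instream_id": attrs.get("INSTREAM-ID"),
            }
        )
    return captions


SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=CLOSED-CAPTIONS,GROUP-ID="cc",LANGUAGE="en",NAME="English",INSTREAM-ID="CC1"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",LANGUAGE="en",NAME="English",URI="audio.m3u8"
"""
print(parse_closed_captions(SAMPLE))
```

Only the CLOSED-CAPTIONS entry is returned; the AUDIO rendition in the sample is skipped.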
Andy
d1e6d0812c fix(dash): pass period_filter to n_m3u8dl_re via filtered MPD file
The period_filter in DASH.to_tracks() only affected track listing but had no effect on n_m3u8dl_re downloads, which re-parsed the raw MPD and downloaded all periods including ads/pre-rolls. This caused DRM decryption failures and corrupted video output.
When periods are filtered during to_tracks(), write a filtered MPD (with rejected periods removed) to a temp file and pass it to n_m3u8dl_re via track.from_file.

Closes #51
2026-03-01 13:18:27 -07:00
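The filtered-MPD approach in d1e6d0812c could look roughly like the sketch below. It assumes rejected periods are identified by their `id` attribute and uses the standard-library XML parser; the helper name `write_filtered_mpd` is hypothetical, and the real code may select periods differently.

```python
import tempfile
import xml.etree.ElementTree as ET

MPD_NS = "urn:mpeg:dash:schema:mpd:2011"


def write_filtered_mpd(mpd_text: str, rejected_ids: set[str]) -> str:
    """Drop Period elements whose id is in rejected_ids and save the result.

    Returns the path of the temporary file holding the filtered manifest.
    """
    ET.register_namespace("", MPD_NS)  # keep the default namespace on output
    root = ET.fromstring(mpd_text)
    # findall() returns a list, so removing children while looping is safe.
    for period in root.findall(f"{{{MPD_NS}}}Period"):
        if period.get("id") in rejected_ids:
            root.remove(period)
    with tempfile.NamedTemporaryFile("wb", suffix=".mpd", delete=False) as fh:
        ET.ElementTree(root).write(fh, encoding="utf-8", xml_declaration=True)
        return fh.name


SAMPLE_MPD = f'<MPD xmlns="{MPD_NS}"><Period id="ad-1"/><Period id="main"/></MPD>'
filtered_path = write_filtered_mpd(SAMPLE_MPD, {"ad-1"})
```

The returned path is what a downloader would then be pointed at instead of the original manifest URL.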
Andy
c10257b8dc Revert "feat(debug): add JSONL debug logging to decryption, muxing, and all downloaders"
This reverts commit cc89f4ca93.
2026-02-17 14:37:50 -07:00
Andy
cc89f4ca93 feat(debug): add JSONL debug logging to decryption, muxing, and all downloaders
Expand debug logging coverage for better diagnostics when investigating download/decryption issues such as the QUICKTIME/cbcs problem.
2026-02-17 13:58:36 -07:00
Andy
3ee554401a feat(HLS): improve audio codec handling with error handling for codec extraction
2026-02-10 08:34:54 -07:00
Andy
5650c2b591 fix(hls): remove no-op encryption_data reassignment
2026-02-08 10:43:49 -07:00
Andy
6b8a8ba8a8 feat(cdm): normalize CDM detection for local and remote implementations
Add unshackle.core.cdm.detect helpers to classify CDMs consistently across local and remote backends.

- Add is_playready_cdm/is_widevine_cdm for DRM selection across pyplayready, pywidevine, and wrappers

- Add is_remote_cdm/is_local_cdm/cdm_location so services can branch on CDM execution location

- Switch core DASH/HLS parsing, track DRM selection, and dl CDM switching away from brittle isinstance/DecryptLabs-only checks

- Make unshackle.core.cdm import-light via lazy __getattr__ so optional CDM deps are only imported when needed
2026-02-08 00:37:53 -07:00
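The "import-light via lazy `__getattr__`" idea in 6b8a8ba8a8 relies on module-level `__getattr__` (PEP 562). The sketch below shows the general technique only: the attribute names and backing modules are stand-ins from the standard library, and the in-memory module `lazy_pkg` is a hypothetical substitute for a real package `__init__.py`.

```python
import importlib
import types

# Hypothetical mapping of public attribute name -> module that provides it.
_LAZY = {"loads": "json", "sqrt": "math"}


def _lazy_getattr(name: str):
    """Module-level __getattr__ (PEP 562): import the backing module on first access."""
    if name in _LAZY:
        value = getattr(importlib.import_module(_LAZY[name]), name)
        setattr(pkg, name, value)  # cache so __getattr__ is not consulted again
        return value
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


# Stand-in for a package's __init__.py, built in memory so the sketch is runnable.
pkg = types.ModuleType("lazy_pkg")
pkg.__getattr__ = _lazy_getattr

print(pkg.sqrt(9))  # the math module is only imported at this point
```

In a real package the function would simply be named `__getattr__` at the top level of `__init__.py`, so optional dependencies are imported only when one of their names is requested.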
Andy
96411e5d7d fix(hls): keep range offset numeric and align MonaLisa licensing
- Parse init section byterange offset as int to avoid string arithmetic bugs

- Wrap MonaLisa licensing in the same progress + error handling flow as Widevine/PlayReady
2026-02-07 19:44:23 -07:00
Andy
ace89760e7 fix(hls): finalize n_m3u8dl_re outputs
- Add a small helper to move N_m3u8DL-RE final outputs into the expected temp path (preserve actual suffix) and keep subtitle codec consistent with the produced file.
- Skip generic HLS segment merging when N_m3u8DL-RE is in use to avoid mixing in sidecar files and reduce Windows file-lock issues.
- Harden segmented WebVTT merging to avoid IndexError when caption segment indexes exceed the provided duration list.
2026-02-06 16:17:06 -07:00
CodeName393
3fa4a81a39 Fix Missing HLS Curl Session Processing
2026-02-05 19:25:51 +09:00
Andy
1cde8964c1 fix(dash): preserve MPD DRM instead of overwriting from init segment
The init_data DRM extraction was unconditionally overwriting DRM already extracted from MPD ContentProtection elements. This caused failures when init segments contain malformed PSSH data while the MPD has valid PSSH.

Now only falls back to init_data extraction when no DRM was found from the manifest, matching the behavior in version 2.1.0.
2026-02-02 12:02:16 -07:00
Andy
cc55fd8922 fix(dash): add CENC namespace support for PSSH extraction
Some MPD manifests use the cenc: namespace prefix for PSSH elements (e.g., <cenc:pssh>) instead of non-namespaced <pssh>. This caused DRM extraction to fail for services whose manifests use the namespaced form.

- Add {urn:mpeg:cenc:2013}pssh fallback for Widevine PSSH extraction
- Add {urn:mpeg:cenc:2013}pssh fallback for PlayReady PSSH extraction
2026-02-02 10:59:15 -07:00
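The namespace fallback in cc55fd8922 can be sketched as follows. This uses the standard-library `xml.etree.ElementTree` for self-containment (an assumption; the project may use a different XML library), and `find_pssh` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CENC_NS = "urn:mpeg:cenc:2013"


def find_pssh(content_protection: ET.Element) -> Optional[str]:
    """Return the PSSH text from a ContentProtection element, namespaced or not."""
    # Try the plain tag first, then the {urn:mpeg:cenc:2013}pssh form.
    for tag in ("pssh", f"{{{CENC_NS}}}pssh"):
        text = content_protection.findtext(tag)
        if text:
            return text.strip()
    return None


plain = ET.fromstring("<ContentProtection><pssh>AAAA</pssh></ContentProtection>")
namespaced = ET.fromstring(
    f'<ContentProtection xmlns:cenc="{CENC_NS}">'
    "<cenc:pssh>BBBB</cenc:pssh></ContentProtection>"
)
print(find_pssh(plain), find_pssh(namespaced))
```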
Andy
84466e12de Merge branch 'feat/monalisa-drm' into dev
2026-02-02 08:25:43 -07:00
Andy
6dd1ce6df9 fix(dash): handle high startNumber in SegmentTimeline for DVR manifests
When a DASH manifest has a high startNumber (common in DVR/catch-up content from live streams), the segment range calculation would produce an empty range because end_number was set to len(segment_durations) rather than being offset by startNumber.
2026-02-01 11:12:29 -07:00
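A minimal sketch of the range problem described in 6dd1ce6df9, assuming segment numbers are enumerated with `range()`. Both function names are hypothetical, and the "before" version is a reconstruction from the commit message rather than the actual removed code.

```python
def segment_numbers_before(start_number: int, segment_durations: list[int]) -> range:
    """Reconstruction of the bug: the end is not offset by startNumber."""
    end_number = len(segment_durations)
    return range(start_number, end_number)


def segment_numbers_after(start_number: int, segment_durations: list[int]) -> range:
    """Offset the (exclusive) end by startNumber so every duration gets a number."""
    end_number = start_number + len(segment_durations)
    return range(start_number, end_number)


durations = [4, 4, 4]  # three timeline entries
print(list(segment_numbers_before(5000, durations)))  # empty: 5000 >= 3
print(list(segment_numbers_after(5000, durations)))
```

With a startNumber of 5000 and three timeline entries, the first version yields nothing, while the second yields 5000, 5001 and 5002.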
Andy
d0d8044fb3 feat(video): detect interlaced scan type from MPD manifests
2026-01-31 23:51:57 -07:00
Andy
3fcad1aa01 feat(drm): add MonaLisa DRM support to core infrastructure
- Add MonaLisaCDM class wrapping wasmtime for key extraction
- Add MonaLisa DRM class with decrypt_segment() for per-segment decryption
- Display Content ID and keys in download output (matching Widevine/PlayReady)
- Add wasmtime dependency for WASM module execution
2026-01-31 22:05:44 -07:00
Andy
8c8c9368ba fix(manifests): correct DRM type selection for remote PlayReady CDMs
HLS: Filter segment keys by CDM type during aria2c merge phase to prevent incorrect Widevine selection when using PlayReady-only CDMs. The merge phase now uses filter_keys_for_cdm() before get_supported_key(), matching the pattern used in initial licensing.

DASH: Extend PlayReady CDM detection to include remote CDMs with is_playready attribute, not just native PlayReadyCdm instances. This ensures correct DRM extraction order from init_data when using remote PlayReady CDMs.
2026-01-29 10:34:03 -07:00
Andy
e3767716f3 feat(debug): add download output verification logging
Add comprehensive debug logging to diagnose N_m3u8DL-RE download failures where the process exits successfully but produces no output files.
2026-01-24 10:36:43 -07:00
Andy
766447cd71 fix(hls): prefer media playlist keys over session keys for accurate KID matching
Session keys from master playlists often contain PSSHs with multiple KIDs covering all tracks, causing licensing to return keys for wrong KIDs.

Changes:
- Unified DRM licensing logic for all downloaders
- Prefer media playlist EXT-X-KEY tags which contain track-specific KIDs
- Add filter_keys_for_cdm() to select keys matching configured CDM type
- Add get_track_kid_from_init() to extract KID from init segment with fallback to drm.kid from PSSH
- Track initial_drm_key to prevent double-licensing on first segment
- Simplify n_m3u8dl_re block to reuse common licensing flow
- Use strict PlayReady keyformat matching via PR_PSSH.SYSTEM_ID URN instead of loose substring match
- Fix PlayReady keyformat comparisons that incorrectly compared strings to PlayReadyCdm class
- Fix byterange header format in get_track_kid_from_init() to use HLS.calculate_byte_range()

Also fixes PlayReady keyformat matching in:
- unshackle/core/tracks/track.py
- unshackle/core/drm/playready.py

Fixes download failures where track_kid was null or mismatched, causing wrong content keys to be obtained during PlayReady/Widevine licensing.
2026-01-21 00:20:07 +00:00
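The strict keyformat matching from 766447cd71 might be sketched like this. The dict-based key representation and the name `keys_for_cdm` are illustrative assumptions, not the project's `filter_keys_for_cdm()` signature; the two URNs are the published Widevine and PlayReady system IDs.

```python
WIDEVINE_URN = "urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"
PLAYREADY_URN = "urn:uuid:9a04f079-9840-4286-ab92-e65be0885f95"


def keys_for_cdm(keys: list[dict], cdm_type: str) -> list[dict]:
    """Keep only keys whose KEYFORMAT exactly matches the configured CDM type.

    Exact comparison against the system ID URN avoids the false positives
    that a loose substring check can produce.
    """
    wanted = PLAYREADY_URN if cdm_type == "playready" else WIDEVINE_URN
    return [k for k in keys if k.get("keyformat", "").lower() == wanted]


session_keys = [
    {"keyformat": WIDEVINE_URN, "uri": "data:text/plain;base64,AAAA"},
    {"keyformat": PLAYREADY_URN, "uri": "data:text/plain;base64,BBBB"},
    {"keyformat": "identity", "uri": "https://example.com/key"},
]
print(keys_for_cdm(session_keys, "playready"))
```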
Andy
a01f335cfc fix(dash): handle N_m3u8DL-RE merge and decryption
- Add skip_merge flag for N_m3u8DL-RE to prevent duplicate init data
- Pass content_keys to N_m3u8DL-RE for internal decryption handling
- Use shutil.move() instead of manual merge when skip_merge is True
- Skip manual decryption when N_m3u8DL-RE handles it internally

Fixes audio corruption ("Box 'OG 2' size is too large") when using N_m3u8DL-RE with DASH manifests that have SegmentBase init data. The init segment was being written twice: once by N_m3u8DL-RE during its internal merge, and again by dash.py during post-processing.
2026-01-16 13:25:34 +00:00
Andy
b01fc3c8d1 fix(dash): handle placeholder KIDs and improve DRM init from segments
- Add CENC namespace support for kid/default_KID attributes
- Detect and replace placeholder/test KIDs in Widevine PSSH:
  - All zeros (key rotation default)
  - Sequential 0x00-0x0f pattern
  - Shaka Packager test pattern
- Change DRM init condition from `not track.drm` to `init_data` to ensure DRM is always re-initialized from init segments

Fixes issue where Widevine PSSH contains placeholder KIDs while the real KID is only in ContentProtection default_KID attributes.
2026-01-15 12:50:22 +00:00
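Placeholder detection as described in b01fc3c8d1 could be approximated as below. Only the two patterns whose bytes are spelled out in the commit message are covered; the Shaka Packager test pattern is left out because its value is not given there. `resolve_kid` is a hypothetical helper name.

```python
from typing import Optional
from uuid import UUID

PLACEHOLDER_KIDS = {
    UUID(int=0),                   # all zeros (key rotation default)
    UUID(bytes=bytes(range(16))),  # sequential 0x00-0x0f pattern
}


def resolve_kid(pssh_kid: Optional[UUID], default_kid: Optional[UUID]) -> Optional[UUID]:
    """Prefer the PSSH KID unless it is missing or a known placeholder."""
    if pssh_kid is None or pssh_kid in PLACEHOLDER_KIDS:
        # Fall back to ContentProtection default_KID when one is available.
        return default_kid or pssh_kid
    return pssh_kid


real_kid = UUID("9eb4050d-e44b-4802-932e-27d75083e266")  # arbitrary example value
print(resolve_kid(UUID(int=0), real_kid))
```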
Andy
a7b6e9e680 feat(drm): add CDM-aware PlayReady fallback detection
Add PlayReady PSSH/KID extraction from track and init data with CDM-aware ordering. When PlayReady CDM is selected, tries PlayReady first then falls back to Widevine. When Widevine CDM is selected (default), tries Widevine first then falls back to PlayReady.
2026-01-15 02:49:56 +00:00
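The CDM-aware ordering in a7b6e9e680 amounts to trying the DRM system that matches the selected CDM first. A minimal sketch, assuming PSSH data is available in a dict keyed by system name (an illustrative structure, not the project's own):

```python
from typing import Optional


def drm_try_order(cdm_is_playready: bool) -> tuple[str, str]:
    """Order in which DRM systems are attempted for the selected CDM."""
    if cdm_is_playready:
        return ("playready", "widevine")
    return ("widevine", "playready")  # Widevine is the default


def first_available(pssh_by_system: dict, cdm_is_playready: bool) -> Optional[tuple[str, str]]:
    """Return (system, pssh) for the first system in try-order that has a PSSH."""
    for system in drm_try_order(cdm_is_playready):
        pssh = pssh_by_system.get(system)
        if pssh:
            return system, pssh
    return None


both = {"widevine": "WV_PSSH", "playready": "PR_PSSH"}
print(first_available(both, cdm_is_playready=True))
```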
Andy
17a91ee4bb feat(debug): add comprehensive debug logging for downloaders and muxing
2026-01-05 09:50:33 +00:00
Andy
2d4bf140fa fix(dash): add AdaptationSet-level BaseURL resolution
Add support for BaseURL elements at the AdaptationSet level per DASH spec. The URL resolution chain now properly follows: MPD → Period → AdaptationSet → Representation.
2025-11-25 16:09:28 +00:00
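The resolution chain in 2d4bf140fa (MPD → Period → AdaptationSet → Representation) can be illustrated with `urllib.parse.urljoin`, where each level's BaseURL is resolved against the result of the level above. `resolve_base_url` is a hypothetical helper and the URLs are made up.

```python
from typing import Optional
from urllib.parse import urljoin


def resolve_base_url(manifest_url: str, *base_urls: Optional[str]) -> str:
    """Resolve BaseURL values in order, outermost element first."""
    resolved = manifest_url
    for base in base_urls:
        if base:  # a level without a BaseURL leaves the result unchanged
            resolved = urljoin(resolved, base)
    return resolved


url = resolve_base_url(
    "https://cdn.example.com/title/manifest.mpd",
    None,        # MPD-level BaseURL (absent in this example)
    "period1/",  # Period
    "video/",    # AdaptationSet
    "1080p/",    # Representation
)
print(url)
```

An absolute BaseURL at any level replaces everything resolved so far, which is standard `urljoin` behaviour.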
Andy
492134b8ff fix(hls): convert range_offset to int to prevent TypeError
Fixed TypeError in calculate_byte_range where range_offset was a string instead of an int. byte_range.split("-")[0] returns a string, but the calculate_byte_range method expects its fallback_offset parameter to be an int.
2025-11-14 23:08:13 +00:00
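The type problem behind 492134b8ff is easy to reproduce in isolation. The sketch below does not call the project's `calculate_byte_range`; `parse_range_offset` is a hypothetical helper showing only the string-to-int conversion.

```python
def parse_range_offset(byte_range: str) -> int:
    """Return the start offset of a 'start-end' byte range as an int."""
    return int(byte_range.split("-")[0])


byte_range = "1024-2047"
raw_offset = byte_range.split("-")[0]  # "1024", still a str

try:
    raw_offset + 512  # str + int is the kind of arithmetic that fails
    failed = False
except TypeError:
    failed = True

next_offset = parse_range_offset(byte_range) + 512
print(failed, next_offset)
```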
stabbedbybrick
9ed5133c4c N_m3u8DL-RE: Improve track selection, add download arguments and option to load manifest from file (#38)
* feat: Add 'from_file', 'downloader_args' to Track

* feat: Add loading HLS playlist from file

* refactor: Improve track selection, args for n_m3u8dl_re
2025-11-08 13:57:52 -07:00
Andy
27d0ca84a3 fix(dash): correct segment count calculation for startNumber=0
Fix off-by-one error in SegmentTemplate segment enumeration when startNumber is 0. Previously, the code would request one extra segment beyond what exists, causing 404 errors on the final segment.

The issue was that end_number was calculated as a segment count via math.ceil(), but then used incorrectly with range(start_number, end_number + 1), treating it as both a count and an inclusive endpoint.

Changed to explicitly calculate segment_count first, then derive end_number as: start_number + segment_count - 1

Example:
- Duration: 3540.996s, segment duration: 4s
- Before: segments 0-886 (887 segments) - segment 886 doesn't exist
- After: segments 0-885 (886 segments) - correct
2025-11-02 20:30:06 +00:00
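The arithmetic in 27d0ca84a3 can be checked with a short sketch. The function name `segment_range` is hypothetical; the formula follows the commit message (segment count first, then an inclusive end number).

```python
import math


def segment_range(duration: float, segment_duration: float, start_number: int) -> range:
    """Enumerate SegmentTemplate segment numbers without overshooting."""
    segment_count = math.ceil(duration / segment_duration)
    end_number = start_number + segment_count - 1  # inclusive last segment
    return range(start_number, end_number + 1)


numbers = segment_range(3540.996, 4, start_number=0)
print(numbers[0], numbers[-1], len(numbers))
```

For the example in the commit message this gives segments 0 through 885, 886 in total, and the same count is preserved for any other startNumber.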
TPD94
087df59fb6 Update hls.py
2025-10-21 21:07:24 -04:00
TPD94
03f08159b4 Update dash.py
2025-09-29 21:01:55 -04:00
TPD94
2e2f8f5099 Fix remoteCDM, add curl_cffi to instance check
2025-09-29 20:48:59 -04:00
Andy
35efdbff6d feat: add curl_cffi session support with browser impersonation
- Add new session utility with curl_cffi support for anti-bot protection
- Update all manifest parsers (DASH, HLS, ISM, M3U8) to accept curl_cffi sessions
- Add browser impersonation support (Chrome, Firefox, Safari)
- Fix cookie handling compatibility between requests and curl_cffi
- Suppress HTTPS proxy warnings for better UX
- Maintain full backward compatibility with requests.Session
2025-09-25 06:27:14 +00:00
Andy
4006593a8a Fix: Implement lazy DRM loading for multi-track key retrieval
- Add deferred DRM loading to M3U8 parser to mark tracks for later processing
- Optimize prepare_drm to load DRM just-in-time during download process
2025-09-12 06:38:14 +00:00
Andy
eac2ff4cee feat(hls): Enhance segment retrieval by allowing all file types and clean up empty segment directories
Fixes issues with VTT files from HLS not being found correctly due to new HLS "changes".
2025-08-12 20:25:42 +00:00
Andy
798b5bf3cd feat(hls): Enhance segment merging with recursive file search and fallback to binary concatenation
2025-08-11 03:53:17 +00:00
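A recursive search with a plain binary concatenation fallback, as named in 798b5bf3cd, might look like the sketch below. It assumes segment file names sort correctly as paths (for example zero-padded indexes) and that the output file lives outside the segment directory; `concat_segments` is a hypothetical helper, not the project's merge routine.

```python
import tempfile
from pathlib import Path


def concat_segments(segment_dir: Path, output: Path) -> int:
    """Find segment files recursively and append them to output in sorted order.

    Returns the number of files merged.
    """
    segments = sorted(p for p in segment_dir.rglob("*") if p.is_file())
    with output.open("wb") as out:
        for segment in segments:
            out.write(segment.read_bytes())
    return len(segments)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "segments" / "nested").mkdir(parents=True)
    (root / "segments" / "0001.ts").write_bytes(b"AA")
    (root / "segments" / "nested" / "0002.ts").write_bytes(b"BB")
    merged_count = concat_segments(root / "segments", root / "merged.ts")
    merged = (root / "merged.ts").read_bytes()

print(merged_count, merged)
```

Raw concatenation is only meaningful for formats that tolerate it, such as MPEG-TS segments; container-aware merging remains the better first choice.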
Andy
d37014f53f Initial Commit
2025-07-18 00:46:05 +00:00