User Details
- User Since
- Oct 15 2014, 8:30 PM
- Roles
- Disabled
- IRC Nick
- bvibber
- LDAP User
- Unknown
- MediaWiki User
- Unknown
May 25 2024
May 10 2024
Apr 3 2024
Feb 20 2024
Fix deployed and I've started a batch job re-running files that may have been affected in the last ~week. Resolved yay \o/
Thanks, in process of creating dev account bvibber ...
This smells like PHP seeing "0" and evaluating it as boolean false, a common error when intending to test for non-empty strings. (PHP's coercion rules evaluate "0" as the number 0, which then evaluates to a boolean false, when it's used as a bare condition.)
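To illustrate the coercion rule, here's a small Python sketch of PHP's bare-condition truthiness (the function name and value list are mine, just mimicking PHP's documented falsy values):

```python
def php_truthy(value):
    # Mimics PHP's bare-condition coercion: null, false, 0, 0.0,
    # the empty string, the string "0", and the empty array are all falsy.
    return value not in (None, False, 0, 0.0, "", "0", [])

# A check like `if ($offset)` therefore silently drops a legitimate "0":
print(php_truthy("0"))   # False -- "0" is treated as empty
print(php_truthy("00"))  # True -- any other non-empty string passes
```

The fix is to test explicitly for the condition you mean (e.g. `$value !== ''` in PHP) rather than relying on bare coercion.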
Playlist rewrites ran after the code fix went out... and confirmed, this seems to be fixed now. In iOS 15 the app will show low-res videos, but it will show them now. :)
Scheduled for backport window today, then I'll re-run affected generated tracks
Feb 19 2024
Feb 14 2024
We're considering raising the file size limit to 5 GB in T357184, please feel free to follow the technical discussion there.
merging to T357184
Feb 13 2024
I had no issue using ffmpeg with this clip on my local machine; could this be an issue in an older version of ffmpeg deployed on WMF servers?
Proposed patches to Parsoid & the Linter extension above add the "missing-image-alt-text" lint, checking for missing alt attributes on <img>s with an attached file resource. This is categorized with the "low priority" lints, and the data for each match includes a machine-readable link to the image file as well as the positional info in the markup for editing.
Feb 12 2024
Hm, this could be an issue with the recent changes to the sandboxing of the ffmpeg process. @Joe can you take a peek? This input file is AV1, which may need more memory for decoding than other formats; perhaps that's tripping something up.
Feb 9 2024
Here's a 4K video that fits under the previous upload limit but whose estimated bitrate puts the transcoded output past both the 2 GB and 3 GB limits:
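As a back-of-the-envelope check, estimated output size is just bitrate × duration (the numbers below are made up for illustration, not from the actual file):

```python
def estimated_size_bytes(bitrate_bps, duration_s):
    # bits/second * seconds, divided by 8 bits per byte
    return bitrate_bps * duration_s // 8

# e.g. a 4K clip at ~48 Mbps running 10 minutes:
size = estimated_size_bytes(48_000_000, 600)
print(size)                # 3600000000 bytes, ~3.6 GB
print(size > 2 * 1000**3)  # True: over the 2 GB limit
print(size > 3 * 1000**3)  # True: over the 3 GB limit too
```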
Feb 8 2024
Since we aggressively cache file metadata I don't expect parsing the whole file to be significantly expensive here. No compressed data has to be decoded, it'd just be seeking through the file during the getid3() call at upload time, which it may well be doing anyway (I haven't checked the code for mkv/webm, but I know that's exactly what the mpeg parser does to get durations)
ffprobe says it has no listed duration o_O in which case I'm not sure if a duration can be read without parsing the whole file, but getid3 happily does that for mp3 etc ;)
This may be another getid3 issue but with the webm/mkv parser, I'll sort through these in a bit and collect a list of affected files to test fixes with.
Feb 7 2024
Looks good except that mime type regression, easy fix in gerrit above ^
Feb 6 2024
Hm, $wgTranscodeBackgroundMemoryLimit is specified in KiB according to the extension.json doc comment, and the default is 2097152 (2 GiB)
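Quick sanity check on the unit math (variable name mirrors the config setting; the values are from the doc comment above):

```python
limit_kib = 2097152  # default $wgTranscodeBackgroundMemoryLimit, in KiB
limit_bytes = limit_kib * 1024
print(limit_bytes == 2 * 1024**3)  # True: exactly 2 GiB
```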
Jan 31 2024
Proposed fix appears to be working great in local testing; it removes the CODECS marker from *only* the 'jpeg' or 'jpeg,mp4a.6b' sources. This might actually get things working on some iOS versions older than 15 too; I'll see what I've got in my test drawer.
Obtained a suitable test device (first-gen iPhone SE) running iOS 15. Confirmed it will play the MJPEG and MP3 back-compatibility tracks, but the system HLS player's codec filtering won't let them through when their codecs are listed correctly. Listing a false avc1 codec string works but feels wrong; simply leaving the CODECS key out of the playlist seems to be enough to let the HLS player try it on iOS 15, and doesn't appear to interfere with iOS 16 or iOS 17.
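For illustration, a master-playlist entry with the CODECS attribute simply left out would look roughly like this (variant URI and bandwidth are made up; per the HLS spec CODECS is recommended but not mandatory on EXT-X-STREAM-INF, and without it the player attempts playback rather than filtering the variant):

```
#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=426x240
240p-mjpeg-mp3.m3u8
```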
Jan 24 2024
(cf. T344378 for planning around missing alt text as a mobile-editing microtask; we're doing a test spike using Linter)
Note that linter enthusiasts may have strong opinions about linter data, but there doesn't appear to be a central place to communicate with them. Before deploying any additional lints that go into this API we should ensure we do some proper community discussion to a) warn people it's coming :D b) consider alternatives or mitigations to a possibly very large influx of one type of lint error, and c) improve these processes for the future, so the next time someone wants to add a regular markup check we can do this more easily :D
CCing myself on this as I'm investigating add'l lints meant for editing microtasks :D
Looking more into the Linter internals; the actual lint checks are done inside Parsoid, in the Linter class; there's no extension/hook interface for adding arbitrary lints (there does appear to be a way for tag extensions to register custom handling, but it won't help us here). Adding an additional check for missing alt text straight into main should be pretty straightforward I think, and then we can draw from the recorded lints on newly edited pages (and do an offline batch run to prefill them)...
Jan 23 2024
[I suspect what's going on is that iOS 15 doesn't like the HLS packaging after all and some of my compatibility reports were incorrect. We'll confirm this once I get more test devices. In the mobile web the JS automatically detects the failure and fails over to the WebAssembly codec; this isn't done in the app at this time. We can either do the same within the app, if necessary for back-compat, or do an alternate fix of some kind. Keeping this on my claim for this week.]
Jan 22 2024
Ok, I'll change it back to the original title though I felt that was not actionable.
Jan 18 2024
A harder, but possibly worthwhile option I mentioned on IRC: we could encode each ~10-second input chunk separately, then stitch them back together on completion. This would let long-duration and high-resolution files divide into multiple chunks that run simultaneously, each on a limited and predictable max core count.
Another complication on thread count -- the VP9 encoder can only make effective use of so many threads, based on the size of the frame (which limits how many macroblocks can be processed simultaneously) :D So usage is tough to predict from the job type. We *could* implement some kind of variable sizing where we "block up", so a 240p job takes 2 CPUs and a 2160p job uses 8 CPUs etc., but I don't know if that can be expressed sensibly.
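The "block up" idea could be as simple as a step table keyed on output height (thresholds here are illustrative, not measured):

```python
def vp9_thread_budget(height):
    # VP9's usable parallelism scales with frame size, so give small
    # outputs fewer cores and reserve wide parallelism for 4K.
    if height <= 240:
        return 2
    if height <= 1080:
        return 4
    return 8

print(vp9_thread_budget(240))   # 2
print(vp9_thread_budget(2160))  # 8
```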
Couple quick notes:
- Reducing thread count is IMO a very bad idea, as most of the time there will be few jobs and they may be high resolution videos. You want to use the maximum number of available threads to keep CPUs occupied or else you're going to waste a lot of time and make the jobs run a lot slower at high resolutions (the slowest jobs). It would be better to sometimes be at full load on a server than to have a single job that takes days instead of hours or hours instead of minutes.
- The task roughly checks out a source file, builds an ffmpeg command line or two, runs them, and then stores and processes the resulting output file. There are two ffmpeg passes at present for better bitrate control, but if need be a single-pass command can be used instead. This may run up to several hours for very large or long videos -- so this is where we have the potential to split the job into three components:
- a regular MediaWiki job gets the info, builds the command options, and sends them to a script or service,
- which transcodes the file according to the given settings, without any MediaWiki logic itself, and calls back to the MediaWiki API, which queues
- a cleanup job to import the file and update database state
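A minimal sketch of that three-way split (all names hypothetical; the real interfaces would live in TimedMediaHandler and a separate service):

```python
def plan_transcode(file_info):
    # Stage 1: regular MediaWiki job -- inspect the file, build ffmpeg options.
    return ["-i", file_info["source"], "-c:v", "libvpx-vp9", "-b:v", file_info["bitrate"]]

def run_transcode(options):
    # Stage 2: standalone service -- runs ffmpeg with the given settings,
    # no MediaWiki logic, then calls back to the MediaWiki API.
    return {"status": "done", "output": "out.webm", "options": options}

def cleanup_job(result):
    # Stage 3: cleanup job -- import the output file and update database state.
    return result["status"] == "done"

result = run_transcode(plan_transcode({"source": "in.mov", "bitrate": "1M"}))
print(cleanup_job(result))  # True
```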
Jan 17 2024
I'm currently blocked on CirrusSearch but I can fix this from the TMH side with a quick hack. We'll bring the underlying internal APIs back up in a bit once I can test my changes. :D
Currently blocked on setting up a local CirrusSearch to drive MediaSearch for local testing. For now I'll restore the original namespacing so the old code should continue to work, but I can't test it.
I'm cleaning this code up right now :D We had a regression due to internal refactoring breaking the way the APIs were referenced. I think I can remove the internals reference and make it cleaner.
Oh wait it's not from there, it's from MediaSearch's invocation. :D Ok I see; we had a regression due to removing OgvJsSupport from the mw namespace in the ES6 refactoring; MediaSearch was directly invoking it and videojs. I'll clean this up on the MediaSearch side.
This in ext.tmh.player.inline.js is failing:
Jan 16 2024
I suspect in 2024 we're actually better off not using an extension; WebCodecs is shipped in Chrome and Safari and can do hardware-accelerated decoding and encoding of video frames and audio data, leaving only demuxing/muxing for the JavaScript side to do. Firefox is also implementing WebCodecs, and should ship it at some point.
Jan 8 2024
I had a FLAC issue on Firefox in the past, but we can use FLAC if it's supported now (16-bit only; using FLAC for 8-bit sources isn't worthwhile because the file sizes will come out the same).
caniuse claims same current-version support for both:
just using the wav source file directly, something like this:
(Note: if the proposed fix for this is not to use WAV pass-through or FLAC transcode, this will probably simply never be done and we should just close.)
Just confirming terms. ;) Duration is always about time. File size is pretty much irrelevant; these are niche files with small audiences, and most likely all relevant files will be fairly short.
About resampling: resampling is just the same thing as slowing down or speeding up something, right? I think they're the same, but in case you see it differently: transcodes must not be resampled, so their structure isn't destroyed. Sometimes a sample rate will be unsupported; when that happens (for example, a sample rate of 10000 Hz), the transcode should be generated at the nearest supported rate (11025 in this case) and contain the same raw data. So the MP3 will sound like the same speed as the source played at 10000 Hz -- a little bit slower.
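The "nearest supported rate" rule is easy to express as a sketch (the list is the standard set of MPEG-1/2/2.5 MP3 sample rates):

```python
MP3_RATES = [8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000]

def nearest_mp3_rate(rate):
    # Pick the supported rate closest to the source's sample rate.
    return min(MP3_RATES, key=lambda r: abs(r - rate))

print(nearest_mp3_rate(10000))  # 11025, matching the example above
```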
Ok, changing the title again to clarify what I originally understood was mostly correct: you want a way to avoid resampling, and there is *nothing specific wrong with the transcodes* except that they're resampled/not-lossless in the first place. :)
(I'm wondering also if your .wav files might be themselves incorrect to begin with and you're just not playing them back to check them against the MP3 or Ogg Vorbis transcodes?)
1: They are slowed down extremely; they only sound like the correct speed at 44100 Hz. Transcodes must not be slowed down.
I cannot reproduce any difference in speed between these three files as played back in VLC or in Chrome:
This task is absolutely not about increasing quality. The transcodes are totally different from their sources: they are slowed down (or sped up) extremely in most of the files.
[Note that fuller support for transparent videos would be _neat_ but likely only useful in combination with autoplay animations. We can think about that later, if you're interested in it happening let's open a separate feature request and we'll throw around ideas about how to fall back for non-transparent-video-supporting browsers.]
[old, not currently occurring, ran old items long ago, closing out]
Note also that VP9 transparency is not supported in Safari.
Thanks, I'll download and test those this afternoon. :)
Note if anything is being "slowed down" this is a SERIOUS FUNCTIONAL ERROR THAT SHOULD NOT BE HAPPENING; please report that specific file to me with a link in a separate bug report.
This is a low-priority feature request. It's not a bad request, but it's low priority because it's a cleanup/nice-to-have, not a functional issue. Whether it gets gotten to in the next few weeks or not is based more on how easy it is to implement cleanly without drawbacks than how important it is. :)
This was working last month in tests on git build, I'll double check this week
Should work on iOS 12 and up iirc, I'll have to run more tests
Dec 12 2023
Dec 4 2023
Started bulk re-encoding with the remuxing used transparently whenever possible. I expect full coverage, at base resolution at minimum, by end of calendar year. :D
Nov 21 2023
Closing.
Nov 20 2023
[In general I know we've got some order of operations problems with the transcode table and other db bits, the job queue, and the file storage -- three distinct storage systems which need to be either in sync or in the proper order to handle weird sync issues. I'm going to collect up a few of these related issues today and try to work them out in the next couple days. If we're lucky I'll have something up for code review before the Thanksgiving holiday Thursday! If not I'll keep poking at it later.]
I'll add this to my queue. Currently I'm chasing down bulk encodings; chasing down ongoing errors will be next and I expect several synchronization-related problems with cached data chunks. :)
Nov 19 2023
Script completed. If no surprises during the next couple days of rebuilds, all should be well to resolve. :D
Nov 17 2023
Script is up to the "An"s, I expect it to complete within ~24-48 hours.
Running a one-off rebuild of streaming manifests separately from the active transcodes. This should fix any that were damaged that haven't had a chance to be fixed by other runs yet:
It'll require actually updating one of the other transcodes on an affected file to rebuild the m3u8 to work again... I've got several batches running, but I might be able to force a simple regeneration too.
Nov 15 2023
Fix deployed, these'll get cleaned up as transcodes run.
Nov 13 2023
I think we can just take this bit out safely, it wasn't reliable and the requeue scripts have to handle the case where a row just got left anyway. :D Patch coming...
Oof, yeah that explains it. I'll add it to my pile of fixes for this week. :D
Nov 12 2023
Nothing should be writing to the database from TimedMediaHandler in page views, as far as I know. I think there used to be something years ago but if it's still there it needs to be removed. Is this still a problem that anyone is aware of?
done
this finished :D
This still happens, and it's mildly annoying. ;) We may need to force proper chronology on the database connections used to fetch file data? Anyway, back on the slate for cleanup work :D
Nov 11 2023
I changed my mind and, in recent updates, instead added extra bandwidth for the higher-frame-rate videos so they still compress nicely.
For .mov container input, which is reasonably common on certain generations of cameras, all we should need now is consistent support for ISO BMFF-family input files with proper codec validation (right now we kinda half-ass it). Depending on source files, some may have AAC audio as well (cf T166025).
Hm, maybe I misremember. I'll check back on this next week; it still isn't playing in Safari, and I'm not sure if it's a cache thing or a breakage thing. I'll deal with it Monday.
I've implemented the squooshing to 2-channel output in recent updates, since We Can't Have Nice Things. Re-running transcodes on this file. See T351025 for the more permanent long-term solution of adding surround tracks, probably using AAC.
Current state of this: