The NTTW Effect: Dispatches from Budapest

The dust has settled following another brilliant No Time To Wait (NTTW) conference, which took place during a snowy week in Budapest and was scheduled to coincide with the 17th birthday of Matroska on December 6th. This is the fourth year the conference has taken place, having launched in Berlin in 2016 before moving to Vienna in 2017 and to the BFI in London in 2018, where I first attended. The week of NTTW4 found me temporarily unemployed, having just left the Media Archive for Central England (MACE) and on the cusp of joining the BFI as a Digital Preservation Data Specialist. In addition, I had the great honour of hosting two roundtables, so NTTW4 was a particularly auspicious one for me.

Goldberger House, Arany János Street, Budapest. Photo Credit: Vera and Donald Blinken Open Society Archives

The conference was hosted at the Vera and Donald Blinken Open Society Archives, on Arany János Street. Amongst their collection they hold documents covering recent history, the Cold War and international human rights violations. The archives rightfully pride themselves on being “a laboratory of archival experiments on new ways of assessing, contextualizing, presenting, and making use of archival documents”. Their projects include Visualizing Human Rights Data, which uses contemporary visualisation tools; the Parallel Archive, a collaborative research environment; Mining History, which allows dissemination of new data analytics in the evolving digital world; and many more.

NTTW4 and the transparency roof light. Photo credit: Dave Rice

The conference was staged in the heart of this collection, a perfect location for an open source and open standards conference with a drive to develop further tools and resources within this collaborative and experimental AV preservation and developer community. Set high above the conference seating was a glass roof, representative of a greater need for transparency. At times it was a little too bright for the projected images, but it formed a wonderful metaphor for the open and transparent nature of this NTTW community.

I’m disappointed to say I didn’t make it to the two hack days that happened ahead of the conference, due to my bad planning and unreliable transfers. As such I can’t provide any feedback, but I heard many positive comments about the usefulness of these sessions. One individual (sorry, I can’t recall who) suggested they follow on from the conference in future, allowing conversations to spill over into hack-based group activities that build on the conference discussions, which seems like a great idea!

Below I’ve picked out a few of my favourites from the two days, but all of the conference presentations are available to view on MediaArea’s YouTube channel. Some of the excellent presentations were a little too complicated for me to grasp quickly, so please don’t take my list as a definitive guide to the best. I’ve included a link to each video: just click the title.

 

Day One

Derailing Best Practices by Dave Rice
I love to hear Dave speak, and this was no exception. Dave introduced early career experiences of his over-reliance on best practice, and times when this was challenged by colleagues. This led him to reconsider these practices and how they might be considered to have expiration dates. It’s a really interesting point that we don’t give these processes anything like the same time and attention that we give to the actual files and objects that we care for. In Dave’s words: “There’s not time to wait for the chance to be the consumer of a miracle solution or product. We need to be constantly nit picking, justifying, contributing and sharing our solutions”.

Migrating ProRes/MOV to FFV1/MKV by Kieran O’Leary
This lightning talk presented findings from IFI Irish Film Archive tests migrating lossy file formats to lossless FFV1. But why would you want to move a collection to FFV1 from ProRes? I asked this question myself last year, as there was a strong desire at MACE to move away from proprietary formats like ProRes to safeguard the collection for long-term storage. Kieran draws on his team’s experiences to reflect on how you might analyse such encoding experiments, particularly with a mind to finding problems with what may seem a logical decision. It was interesting to learn of the holes in FFmpeg’s metadata handling that surfaced when trying to migrate QuickTime-specific data, such as clean aperture (clap) atom information, to the FFV1 Matroska file. The loss of this information can result in irregular playback in various media players. This is an excellent illustration of how something as simple as a QuickTime-specific codec like ProRes can cause the most basic normalisation problems within an active archival workflow, problems that may go unnoticed without analysis of the files before and after.
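As a rough illustration of that kind of before-and-after analysis, here is a minimal Python sketch. It is not the IFI’s workflow: it assumes the metadata of the source and the transcode has already been exported into flat dictionaries (for example from a tool such as FFprobe or MediaInfo), and the field names and values are hypothetical placeholders I have made up.

```python
def diff_metadata(before, after):
    """Return the fields lost, and the fields changed, between two metadata dicts."""
    lost = sorted(key for key in before if key not in after)
    changed = {
        key: (before[key], after[key])
        for key in before
        if key in after and before[key] != after[key]
    }
    return lost, changed


# Hypothetical, hand-written values for illustration only.
source = {"width": 1920, "height": 1080, "clean_aperture": "1888x1062"}
transcode = {"width": 1920, "height": 1080}

lost, changed = diff_metadata(source, transcode)
print("lost:", lost)
print("changed:", changed)
```

Anything that turns up in the lost or changed lists is a prompt to investigate further, not proof of a fault, as some differences between wrappers are expected.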


The final, and I think most important, point was how to deal with these kinds of problems, because solutions are always possible. Developers need to be approached to make changes, and funds need to be found to cover these activities. Such changes are necessary to support FFmpeg use in archiving tools. Kieran draws on IFI’s financial contributions to FFmpeg and RAWcooked, and the processes he followed to make these changes happen, using online bug trackers and direct correspondence with developers. I think this short talk provided a really important overview of a successful test workflow that focuses on problem solving and finding solutions to issues. More importantly, it shines a light on good practice that we can all build on. Maybe by collaborating on analysis of collections and practices, we can find funds collectively to fix issues for the benefit of the many.

State of Matroska / Videolan VLC 4.0 by Steve Lhomme
I felt like this was Steve’s year at NTTW! The event was scheduled to coincide with the 17th birthday of Matroska, celebrating the fateful day when he forked it from the MCF/TMF format: December 6th 2002. Not only is he the face of Matroska, he is also a key developer on the VLC media player project. On day one of the conference he presented twice, with detailed introductions to, and updates on, both of these projects. I would recommend watching them if you use either!

FFmpeg: A universal media encoding tool by Carl Eugen Hoyos
This is a really nice introduction to FFmpeg and the tools that come with it when you install it, FFplay and FFprobe. Carl introduces the eight internal libraries that install with FFmpeg: libavcodec, libavformat, libavfilter, libavdevice, libswscale, libswresample, libpostproc and libavutil. He further discusses the functionality of some of these libraries, giving an overview of active developments and of the codecs and formats supported, covering new (including details of FFmpeg support for AV1), historical and obsolete video and audio files. I was amazed to hear there are now over 250 filters in the FFmpeg family (many churned out by Paul B Mahol, it would seem), with support for scale, crop, overlay, hue, deinterlacing, inverse telecine and framerate interpolation. He closed with a few statements about what FFmpeg is not, particularly mindful of the questions most frequently directed at him by archivists.

The RAWcooked Project by Jérôme Martinez
When reviewing NTTW videos via the MediaArea YouTube channel I find it fascinating to map the development of this project. In 2016 Reto Kromer and Kieran O’Leary presented their fledgling proof of concept using the FFV1 codec to store DPX image sequences via FFmpeg, then in 2017 Jérôme introduced RAWcooked developments, requesting sponsors to move the project beyond its development stages. Since then the product has grown in strength, with the first official version being launched for NTTW3. In this presentation he introduces RAWcooked, the fundamentals of how it operates, and some of the huge benefits of encoding image sequences (TIFF or DPX) to the FFV1 video and FLAC audio codecs. He compares ZIP or TAR compression to FFV1 compression, highlighting the inherent benefits of FFV1’s slice and checksum features, and further explains the issues surrounding metadata transfer to FFV1 and how these details are stored in the Matroska wrapper so that the encoding of an image sequence is fully reversible. There’s loads more in this video, including plans for a GUI version in the future…

Learning by doing: digitization of cold war and human rights collections using FFmpeg-based solutions by Zsuzsa Zádori, Darius Krolikowski, János Dani and József Bóné
This fascinating group of presentations is broken into three sections, introduced in turn by each of the presenters. Zsuzsa begins by introducing the OSA’s collection, focusing on their video media and their preparations for a mass digitisation project in 2017. They invited Dave Rice to consult and advise them on ways to achieve their goals, quickly realising the best route for their archive was a microservice infrastructure. Next, AV archivist Darius demonstrated the chain of equipment used in their video tape capture workflows, before and after Dave, expanding on some problems they encountered with time base correctors in their S-VHS capture workflow. The changes they implemented saw them working with FFmpeg and QCTools. He made an appeal for archives to share knowledge and possibly even access to ‘open hardware’, so that blueprints, service manuals or error code lists can be made available where they may be missing or hard to source.

Next up, OSA help desk representative János discusses the implementation of the open source software Airflow, chosen for its clear visual monitoring, alerts and email notifications, charts and inbuilt error checking. Airflow is Python based, and was originally developed at Airbnb. It has a large and active community support network and allows easy web UI management of complex microservice workflows. To expand on these workflows, or Directed Acyclic Graphs (DAGs), Head of IT József presented one such DAG to the NTTW crowd, documenting the preservation workflow for video files. DAGs are used to collect and structure the tasks you want to run, organised in a way that reflects their relationships and dependencies.
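The idea of tasks ordered by their dependencies can be sketched without Airflow itself. The snippet below is only an illustration using Python’s standard library (graphlib, Python 3.9 or later), not the OSA’s actual workflow, and the task names are hypothetical ones I have invented.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
workflow = {
    "capture": set(),
    "make_checksum": {"capture"},
    "transcode": {"capture"},
    "quality_check": {"transcode"},
    "archive": {"make_checksum", "quality_check"},
}


def run_order(dag):
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


order = run_order(workflow)
print(order)
```

Because the graph is acyclic there is always at least one valid order, and independent tasks (here the checksum and the transcode) can appear in either sequence. A workflow manager such as Airflow adds scheduling, monitoring and retries on top of this basic structure.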

The Airflow processes are operated from within Docker containers, and another Docker container holds all the FFmpeg processes for this mass digitisation project. The project has a blend of bash and Python scripts working together, all of which can be easily used within the Docker container and Airflow workflows, or separately for solo passes. This was a wonderful demonstration of a busy microservice workflow being managed beautifully and achieving amazing video tape capture results. In questions afterwards Zsuzsa was asked how many more staff were needed to meet this change in workflow structure, to which she answered none. This model should be a great inspiration to microservice archives around the world! Find out more about their scripts at the Blinken OSA GitHub.

Adventures in reading FFmpeg logs by Ashley Blewer
I’ve been so excited for this talk since I first heard about it. I’ve spent hours staring blankly at FFmpeg logs, wishing I had some way to penetrate the confusing strings of data and extract value from what I know is very important information. Ashley’s aim was to help beginners new to the command line and FFmpeg. The talk is broken into sections, “What is FFmpeg” (see Carl’s talk above), “What is video”, “What does FFmpeg think a video is?” and “Reading logs, a primer”, and then breaks down the log files in a way that exposes the inner workings of this remarkable media encoder/decoder, using examples of when it goes well and when things break. Ashley’s talk helpfully included some FFmpeg commands and introduces people to ffmprovisr, a collection of FFmpeg commands designed for archival use. Core teachings for log analysis include: read logs from the bottom up; look for the word ‘Error’; colours are clues; and maybe try searching the source code of FFmpeg to locate the particular error phrase. Finally, Ashley runs through error examples and explains the causes in detail. This is one of my absolute favourite NTTW presentations of all time, and it even came with Simpsons GIFs to entertain any bored FFmpeg developers in the audience (like they would be)!
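The first two tips, reading from the bottom up and looking for the word ‘Error’, are easy to try on a saved log. This is a minimal Python sketch of my own and not something shown in the talk; the sample text is made-up placeholder content, not real FFmpeg output, and the function name is hypothetical.

```python
def last_error_lines(log_text, keyword="error", limit=3):
    """Scan a log from the bottom up and return lines mentioning the keyword."""
    hits = []
    numbered = list(enumerate(log_text.splitlines(), start=1))
    for number, line in reversed(numbered):
        if keyword.lower() in line.lower():
            hits.append((number, line.strip()))
            if len(hits) == limit:
                break
    return hits


# Placeholder text only: these are NOT real FFmpeg log lines.
sample_log = """\
placeholder: tool banner
placeholder: input stream details
Error: first placeholder problem
placeholder: more details
Error: second placeholder problem
"""

for number, line in last_error_lines(sample_log):
    print(number, line)
```

The most recent match comes back first, which mirrors the advice that the lines nearest the bottom of a log are usually the most useful starting point.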

Supporting Archivist Practitioners Roundtable
Joanna White, Bryce Roe, Ashley Blewer and Ben Turkus

It was an enormous honour to chair this roundtable with an amazing group of Archivist Practitioners, educators and vendor/educators. It was incredibly daunting for me, but I wholeheartedly thank the NTTW organisers for this amazing opportunity. Our panel aimed to confront issues faced by Archival Practitioners, their constant search for up-to-date information, the challenges presented by organisational conditions, physical isolation, and pressure to determine what information is most pertinent at any time.

Supporting Archivist Practitioners roundtable with added white wine! From the left: Bryce, myself, Ashley and Ben. Photo credit: MediaArea

Our questions led to conversations about gatekeepers of knowledge such as institutions or management with blind trust in vendors, to the complications faced by attempting to train archivists with diverse skills and levels of understanding across myriad audiovisual topics. We introduced some of the audience to the concept of imposter syndrome, frequently experienced by archivists who struggle to stay abreast of ever changing digital processes, and we contemplated how we can encourage archivists already burdened with deadlines and massive workflows to engage in developing their skills through further education. Sincere thanks to Bryce, Ashley and especially Ben (who stepped in last minute) for their insightful experience and the confidence they exuded. Forgive me my nerves but the wine Dave provided (in reference to USA’s Today Show) made things flow more easily!

Day Two

Questionable File Show and Tell by Julia Kim
This presentation seemed to break new boundaries for NTTW! Julia presented various weird born digital and digitised analogue anomalies, displayed in audiovisual file examples alongside metadata extracts, and opened each issue up to the audience for feedback. These were sourced from the AV Artifact Atlas, anonymous submissions and archivist submissions. The issues included moov atom anomalies, weird truncated frozen born digital files, an NTSC and PAL metadata mash-up, missing bit information in 10-bit tape captures, and a crazy 2005 file that can’t decide if it’s 4×3 or 16×9. Audience responses came from FFmpeg developers, video forensic specialists, video tape specialists, developer archivist professionals and open source software developers. I mean, what a diverse and well informed audience, with some amazing feedback! I really hope this presentation becomes an annual event, as the learning possibilities for Archivist Practitioners are really thrilling. A final excellent addition came from Yvonne Ng from Witness, who made an appeal to apply a show and tell methodology to human rights evidence, asking how the open source community can contribute to validating the authenticity of evidence-based media. Bring along your weird files to the Netherlands and let’s make this a regular show and tell for expert feedback!

A Matrix for Video Game Collection by Caylin Smith and Stephen McConnachie

A lightning talk from representatives of a newly formed network of collections looking to ‘gang up’ and face the issues presented by archiving video games. The consortium includes the BFI, Cambridge University Library, British Library, Tate, V&A, Science Museum Group, Wellcome Collection, Museum of London, British Games Institute and National Video Game Museum. The consortium is handling this enormous challenge by splitting into three subgroups: Preservation and Access, Advocacy, and Strategy. The presentation draws on just one example, ‘Hellblade’ by Ninja Theory, and analyses the challenges of Preservation and Access alone. It’s a slightly mind-blowing challenge when you consider the distributed nature of many video game worlds, which include developer diaries, various parallel merchandise ephemera, and mass-produced fan-related video outputs such as gameplay videos! What do you catalogue and where do you draw the line? It was really interesting to have this presentation at NTTW and I can’t wait to see how the consortium progresses.

Backup and Restore of IR Remote Controls by Peter B
Peter may be better known for his excellent contributions to FFV1 codec development. See his lightning presentation on Day One, Presets for FFV1 and Matroska, and many previous NTTW presentations about this amazing codec. This year he has branched out into backing up and emulating infra-red remote controls using the ever versatile Raspberry Pi. His institution purchased some tape decks that came without remote controllers. This sparked a memory from his youth of using Linux to build an IR receiver which recorded text files from IR signals, and the rest is history! This fascinating talk covers his failures and successes and breaks down how he finally achieved his goal of IR remote control emulation. He used a Raspberry Pi to receive the signal, then built a prototype sender to emulate it. He shows the text files that form the codes for each remote button, and how easy it is to edit and replicate them. He further encourages the archiving community to archive and share their remote controls as text files! A really fascinating talk, and I sincerely hope Peter has time for more Raspberry Pi based adventures in the months and years to come, as they could provide excellent budget open source solutions for archives internationally.

Opening Closed Captions by Annie Schweikert and Ben Turkus
This is a powerful presentation that starts with a really great intro from Annie into what captions are, and why it’s a good idea to access them. Captions are digital traces hidden inside an analogue video signal and revealed by tape digitisation, liberating data that contains crucial text-searchable information about its content. Annie explains the history of closed caption development, and how in 1980 the first prerecorded media including closed captions was broadcast. There could be four decades’ worth of video tape captions waiting to be set free in any of our archival collections, so this is an issue we should be thinking about! In Annie’s own words: “Figuring out how to extract closed captions is not just saving us time and duplicated effort, but it’s also honouring a sustained advocacy that gave us captions”. Through combined research efforts, and by harnessing an FFmpeg filter created by Paul Mahol that extracts EIA-608 captions, Dave Rice has developed an open source bash script called ‘sccyou’.

Ben discusses sccyou’s development and operation, starting with the amiaopensource sccyou GitHub, and delves into Dave’s script and what it does. This script also allows you to play back a file using FFplay with your captions overlaid. Ben makes a strong case for open source scripts like this and the deeper insight you gain into the wonderful wealth of data extraction tools from FFmpeg (as seen in the #trashcaptions example). He wraps up with a few development ideas, such as incorporating sccyou into vrecord, handling of multiple caption tracks, refinement of testing, development of QC procedures and better reporting on trash captions.

A Forensic Approach to Video by Gareth Harbord
Data recovery specialist Gareth Harbord brought scientific principles from the Digital Video Forensics Lab of the Metropolitan Police to the NTTW community. His experience includes recovering physical and optical media necessary for Court Room appearances. This includes enhancement! Not quite CSI Miami level, but he demonstrates the benefits of using tools like Blind Deconvolution and Frame Averaging to clean up an image just a little bit. His biggest problem comes from the diversity of sources and handling massively variable file formats from CCTV systems, IP cameras, Network Video Recorders, Dashboard Cameras, Digital Video Recorders, and mobile devices. He discussed the problem of enhancing mobile video material when many phones don’t support colour range metadata correctly, plus the problems of receiving such videos via social media platforms and the additional compression horror this introduces.

The most interesting section of his presentation for me was his use of a hex reader, and examples of the various markers he looks for when trying to identify a video file. He provides loads of examples of hex information from a frame showing size, camera numbers, Epoch start time and end time, atom size, width and height, codec information and many more. This hex demo ends with an amazing explosion from an AVI header. By fixing the file size and total number of frames in a hex editor he was able to recover this AVI to a useable state. And the Met’s forensics department are big users and fans of our archival open source tools, including MediaInfo, FFprobe and ExifTool!
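To give a flavour of what fixing the file size in a header involves, here is a minimal Python sketch. It is not Gareth’s method, just my own illustration of the RIFF layout that AVI files use: bytes 0 to 3 hold ‘RIFF’, bytes 4 to 7 hold a little-endian 32-bit size (the total length minus 8), and bytes 8 to 11 hold the form type. The function name and the synthetic bytes are hypothetical, and the frame count fix mentioned in the talk is not covered.

```python
import struct


def repair_riff_size(data):
    """Rewrite the RIFF size field so it matches the actual byte length."""
    if len(data) < 12 or data[0:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    # The size field counts everything after the first 8 bytes.
    size = len(data) - 8
    return data[0:4] + struct.pack("<I", size) + data[8:]


# Synthetic bytes standing in for a damaged file: the size field is zeroed.
damaged = b"RIFF" + b"\x00\x00\x00\x00" + b"AVI " + b"\x00" * 20
fixed = repair_riff_size(damaged)
print(struct.unpack("<I", fixed[4:8])[0])
```

A real recovery would work on a copy of the file and would need far more care than this, but it shows why a few corrected bytes can be enough to make a player accept a file again.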

Knowledge Obsolescence in Audiovisual Preservation by Brianna Toth
Brianna’s presentation is a development of her graduate research, focusing on collaborations between institutions to avoid issues of obsolescence, particularly of the technical knowledge needed to repair and maintain analogue video playback equipment. This ‘Magnetic Media Crisis’ has approximately 10-15 years to play out, after which the majority of magnetic media will be unplayable. Brianna compiled a list of instances, starting in 1978, relating to knowledge obsolescence and analogue video decks. She collaboratively made a great visualisation in the form of a colour coded timeline, breaking the instances into projects, publications, reports, conferences, task forces and grants, and you can submit your own instances too! She concludes with some steps to potentially combat this knowledge obsolescence and degralescence by encouraging mentorships or bottom up strategies that include large scale collaborative projects. Brianna has also created an AV Knowledge Obsolescence survey, which is well worth your time to complete.

The ‘NTTW Effect’ roundtable
Joanna White, Steve Lhomme, Kieran O’Leary and Jonáš Svatoš
It was a massive honour to share the stage with Steve Lhomme, Kieran O’Leary and Jonáš Svatoš, all of whom have been involved with NTTW since its inception. I felt a little bit of an imposter alongside these talented professionals! Together we discussed our experiences of NTTW, and what the NTTW Effect represented to each of us. For Kieran it was an accelerated development in his skills following the first No Time To Wait and exposure to the powerful combination of developers, archivists and standards adherence. Jonáš was impressed by the easy mix of professional developers and newcomers like himself, making it easier for socially awkward archivists to collaborate with others, so he no longer had to search in the darkness for answers. Steve reflected on the difference in the uptake of Matroska between NTTW1 and NTTW4, attributing it to this conference and the excellent publicising from Dave, Jérôme and Ashley. It was a wonderful moment to hear Steve declare that he learns as much from the community as anyone else, admitting to Kieran that he hadn’t known what fixity meant despite the fact it was already built into Matroska! It’s so encouraging and inspiring to hear this kind of honesty from people like Steve, who contribute so much to the daily digital preservation workflows of so many around the world.

We spent a little time reflecting on the reasons for NTTW’s launch, initially in response to the PREFORMA project, sharing open source formats and tools, and the race to migrate video tape collections. Over the years it’s become less FFV1 and Matroska focused and more about open workflows, in response to the intersections between developers, audiovisual archivists and open standards working groups, and the conference is definitely having an impact on broader international practices. But, particularly in light of Brianna’s preceding presentation, we should still be scared by the obsolescence of video tape: there are many collections in dire need of attention and there really is no time to wait! And there’s new work to complete year on year in response to developments made by the NTTW community, with Steve giving the example of finding ways to store the new additions of closed caption files and timecodes in the Matroska wrapper. Finally, we all drew on our personal highlights, from getting to know the legend that is Carl Eugen Hoyos, and the standardisation of formats like FFV1, to staying at Jérôme’s house in Berlin and feeling part of an extended NTTW family.

I’m delighted to say that this panel was conceived following conversations about the overwhelming impact attending NTTW has on its participants. Pre-NTTW I was feeling conquered by the disparate complexities of audiovisual preservation. I’d started using FFmpeg and tested FFV1 a bit, but hadn’t the support I needed to know how to implement them within an archive. In truth, I was interviewing for non-archival jobs. The assistance and encouragement I received this past year from Dave Rice, Kieran O’Leary, Stephen McConnachie, Ashley Blewer and many others saved my career. It has invigorated it, generating endless enthusiasm which has enabled me to achieve significant open source workflow developments at MACE, and propelled me into a new RAWcooking role for the prestigious BFI. It’s a constant delight to belong to a group flush with so much talent, so generously shared with one another. In return for this support I’ve felt driven to share my fledgling steps to encourage others to follow suit, and to honour the efforts of this community by supporting their remarkable outputs. This experience forms my understanding of what the ‘NTTW Effect’ is to me.

With deepest thanks to Jérôme Martinez and MediaArea, to Dave and all the wonderful organising team behind No Time To Wait, the Open Society Archives, to the sponsors who make it happen each year, to the wonderful panellists who filled me with such confidence to host these roundtables, and to all the community for being such amazing and inspiring people.

You asked me: “What am I?” “A child of the people,” I said.
“Rooted in it, I live by them and for them, belong
to them only – by their fate my own is bred.
And if I burst into song, my lips give home to their songs.”
Extract from the poem ‘Reply to Petőfi’, by Arany János (1847)
