Trying another approach to HDMI-compatible Audio

So, we have had HDMI-compatible video output working for some time. We even have the audio working on many monitors, but not all. There is something fishy with the audio output of the ADV7511 driver, when using certain TVs, monitors and HDMI capture devices. The symptom is either no audio, or no picture at all when audio is enabled.


I've been pulling my hair out over this for a long time now.


To try to circuit-break this situation, we have bought an FPGA development board that doesn't use an ADV chip, but has the FPGA directly drive the video signals. Mike Field and others have done great work on this, to demonstrate that it is quite possible with the Artix7 series FPGAs we are using.


The board we are using for this is a Mimas A7:



This has a smaller FPGA than the MEGA65, and can't run the MEGA65, but it is big enough for me to produce some video and audio, and try it out on the Samsung TV I have here. The Mimas A7 board even has an example HDMI-output project, which I have been able to use and begin to adapt.



But I am still seeing some funny things.


Primarily, if I try to use more than 4 bits of audio resolution, the HDMI TV fails to get any picture at all. It doesn't seem to matter exactly which four bits. I can even sometimes use more than 4 bits, so long as no more than 4 in a row in the 16-bit sample format are used.
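
To make the "no more than 4 in a row" observation concrete, here is a minimal sketch in Python (not the VHDL from the test design) of masking a 16-bit sample and measuring the longest run of adjacent bits that a mask lets through. The function names and example masks are made up for illustration.

```python
def mask_sample(sample, mask):
    """Keep only the bits of a 16-bit audio sample that the mask enables."""
    return sample & mask & 0xFFFF


def longest_run_of_enabled_bits(mask):
    """Length of the longest run of adjacent 1 bits in a 16-bit mask."""
    longest = 0
    current = 0
    for bit in range(16):
        if (mask >> bit) & 1:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


if __name__ == "__main__":
    # Hypothetical masks: two separate 4-bit groups, then one 5-bit group.
    for mask in (0x0F0F, 0x001F):
        print(f"mask {mask:#06x}: longest run = {longest_run_of_enabled_bits(mask)}")
```

With these example masks, 0x0F0F uses 8 bits but never more than 4 adjacent ones, which is the kind of pattern that sometimes still gave a picture, while 0x001F has a run of 5.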


What is REALLY weird is that if I mask out audio bits with dip switches on the board, there is no picture, even when the bits are masked out. This just makes absolutely no sense to me, as what comes out of the HDMI connector on this board should be completely identical when the bits are masked as when I have them simply tied low -- yet only the latter gives a picture.


Now, I am trying to use a different video mode (720x576x50Hz) to what the example project uses (640x480x60Hz), so it is possible that the HDMI info frames are confusing the TV by saying one thing, while the display stream has something quite different in it. What I suspect is that the audio clock recovery coefficients (N and CTS) are causing problems. It still doesn't explain why the really weird problem happens, but if I can confirm that I can use all 16 bits of audio resolution in the default mode, that will still be a helpful data point.
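
As a rough check on why N and CTS depend on the video mode: HDMI audio clock regeneration requires 128 x fs = f_TMDS x N / CTS, so CTS changes whenever the pixel clock does. The sketch below is Python rather than anything from the actual design, and it assumes a 48 kHz sample rate with the commonly recommended N = 6144, plus the standard pixel clocks of 27 MHz for 720x576 at 50Hz and 25.175 MHz for 640x480 at 60Hz. None of those figures are confirmed settings of the test project.

```python
from fractions import Fraction


def cts_for(tmds_clock_hz, sample_rate_hz, n):
    """Solve 128 * fs = f_tmds * N / CTS for CTS, as an exact fraction."""
    return Fraction(tmds_clock_hz) * n / (128 * sample_rate_hz)


if __name__ == "__main__":
    SAMPLE_RATE_HZ = 48_000  # assumed audio sample rate
    N = 6144                 # commonly recommended N for 48 kHz
    modes = {
        "720x576 @ 50Hz": 27_000_000,  # assumed pixel clock
        "640x480 @ 60Hz": 25_175_000,  # assumed pixel clock
    }
    for name, clock_hz in modes.items():
        print(name, "CTS =", cts_for(clock_hz, SAMPLE_RATE_HZ, N))
```

Under those assumptions, a CTS value left over from the 640x480 example would be off by roughly 7% in the 720x576 mode, which is the kind of mismatch a picky TV might object to.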


So I am trying to switch back to the default 640x480 mode, and then try to get audio working in that mode. However, I am hitting another really weird problem: When I try to work out the set of flags required for that mode (these are some flags that go into the HDMI info frame), I can make them work if I have them hardwired to the correct values, but if I try to make them settable via the dip switches, they don't change. This is despite the fact that I can read the dip switches, put them into an array of boolean flags, check them elsewhere in the design, and even have their status output continuously on a serial port where I can see them changing.


Yet the TV steadfastly claims that there is no signal, no matter how I twiddle the dip switches, even though the values on the serial output show that they are being set correctly. So then I plumbed some return signals back through, so that I can see whether the dip-switch values are getting propagated down into the HDMI video generator, which they are. This is of course very bizarre, since they are clearly being set, and getting where they need to be, but seemingly not affecting the video stream being produced.


And of course this is VHDL and FPGAs, so each "recompile" takes several minutes, which makes for a really frustrating debugging experience, especially when making only tiny changes.


So, after several more waits for resynthesis runs, I have again confirmed that if the EnhancedMode (which enables HDMI info frames) is hard-wired to false (as compared to being held low via a dip-switch mapping !!), then I get a perfectly fine image on the TV. Recall that I have confirmed that the signal in question goes into the HDMI module and comes back out again, confirming that it should darn well be visible to the HDMI module when the dip-switch toggles.


If the dip-switch is in the correct position on power-on, there should in fact be absolutely no difference in behaviour -- despite the opposite happening in practice. The only possible clue here is that seemingly the flags are being treated as always true, when connected to the dip-switch.


Right. So, after adding a counter into the HDMI generator that counts the number of times that a particular area is entered, it suddenly works.

So bizarre is this, that I even looked at the Vivado logs for both versions for any warnings that might differ between the two -- but there are none.


This does not really give me any encouragement for tracking down the audio problems, since that is also exhibiting similarly nonsensical behaviour. It might just be that I need to update to the latest version of Vivado, and see if this is a bug that has been fixed in the meantime. In any case, if I want to log an issue with Xilinx, they will want me to try it with the latest version anyway.


Ok. So 16GB of downloads over our poor satellite later, I have Vivado 2019.2 installed in place of the old 2017.4 version, and it pleasingly fixes this weird bug with needing the counter. I'll now carefully and progressively back out the debug stuff I added in, and see if this now also fixes the weirdness I was seeing with not being able to use more than 4 bits of the audio. I'm hopeful, as the two problems feel rather similar in their complete insanity. Indeed my hunch seems to be true: I can now get 8-bit audio (and presumably the full 16-bit available) in 720p50.


Next step is to switch out the sample VGA frame generator for the MEGA65 one, so that I can be sure that I can get audio with that. At that point, I will prepare a bitstream for the team to test in Germany with the monitors that they have there. If it works with every monitor we can throw at it, then we know we have a good solution.


So far so good: I now have a bitstream where a dip-switch controls if it is PAL or NTSC mode, and I still get sound. However, I am not yet convinced that the audio quality is as good as it should be. This could just be my wooden ear, or it could also be that the 36-element Sine table I am using is not that great. So I'll still try to get an audio sample recorded and imported as a little ROM in the test design, so that I can verify that the sound is right, or not.
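
For anyone wanting to reproduce the test tone, a 36-entry sine table can be generated along these lines. This is only a sketch: the full-scale 16-bit amplitude and the 48 kHz sample rate used to estimate the tone frequency are assumptions, not values taken from the test design.

```python
import math


def sine_table(entries=36, amplitude=32767):
    """One full sine period as signed integers, one entry per equal phase step."""
    return [round(amplitude * math.sin(2 * math.pi * i / entries))
            for i in range(entries)]


def tone_frequency_hz(sample_rate_hz, entries=36):
    """Tone produced when the table advances by one entry per audio sample."""
    return sample_rate_hz / entries


if __name__ == "__main__":
    table = sine_table()
    print(table)
    # Assuming a 48 kHz sample rate, this is a tone of about 1333 Hz.
    print(tone_frequency_hz(48_000))
```

With only 36 steps per period the waveform is quite coarse, so some audible roughness would not be surprising even if the HDMI audio path itself is working perfectly.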


After all manner of further adventures with Vivado weirdness (or possibly PEBCAC), where changing a single unrelated line of code changed the behaviour of the whole thing, I finally have audio working. It is possible that the Samsung monitor rejects any HDMI signal that has what it thinks is impossible sound. I'm not sure.


But in any case, I can now push 16-bit stereo audio out the HDMI interface, which is correctly received by the Samsung TV -- which is one step up on what we have been able to achieve with the ADV7511 HDMI chip on the R2 PCB. Indeed, I've been able to reach this point in MUCH less time than the whole fiddling with the HDMI chip took. I'd still like to know whatever it is that the ADV7511 is doing that the Samsung TVs don't like. But we don't really have the time to investigate that right now.


So now it's time to package up that bitstream for the others in the team to try on one of these FPGA boards with all the monitors that they have on hand there. If that all goes well, we will have a solution that is probably a few Euros cheaper per machine. That will help us to either keep the cost down, or, possibly by the magic of modern economics, to include a nice secret surprise on the production MEGA65 PCBs...


Anyway, I'm glad to finally have HDMI with audio working on the Samsung, and "ich drücke die Daumen" (I'm crossing my fingers) that our German half of the team will have success with the monitor testing on their side. I'll also see what random selection of HDMI-compatible displays we have lurking around the other buildings here over the next few days, and test those, too.