It's quite easy to understand. The tech industry has gone through 4-5 generations of obsolete NPU hardware that was dead-on-arrival. Meanwhile, there are still GPUs from 2014-2016 that run CUDA and are more power efficient than the NPUs.
The industry has to copy CUDA, or give up and focus on raster. ASIC solutions are a snipe hunt, not to mention small and slow.
We've seen the same kinds of discourse arrive here that are common on other social media sites, where too much political discourse is just signaling which tribe you belong to and vilifying anyone outside it.
Don't forget about France, Germany, Poland, Hungary, Italy, Slovakia, Netherlands, Chile, Argentina, or Honduras. Right-wing authoritarianism is not limited to the anglosphere.
Photon microGUI was included in that, and it blew my mind that you could literally kill and restart Photon without disturbing any of the GUI apps that were still running.
They also mailed a manual along with the demo disk, and I was amazed that QNX had built-in network bonding, amongst lots of other neat features. At the time I was using Slackware and the Linux kernel was still 1.x; I don't think bonding came to Linux until 2.x?
A lot of the responses seem to be blaming the user/learner and requiring them to change their mindset/attitude, which is actually an insane take.
As you pointed out, SRS isn't the full solution.
BTW, I would say that language classes often try to maintain a constant level of difficulty, but there is usually some kind of coverage of the previous material too.
Pretty good. I've noticed the animation tends to veer off / hallucinate quite a lot near the end. It's clear that the model is not maintaining any awareness of the first image. I wonder if there's a way to keep the original image in the context, or add the original image back in at the halfway mark.
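Roughly what I mean, as a sketch only: this assumes a latent-space denoising loop you can hook into, and the `denoise_step` callable and `blend` weight are placeholders, not anything from the actual workflow.

    import numpy as np

    def generate_with_reinjection(init_latent, num_steps, denoise_step, blend=0.3):
        """Run a denoising loop, re-blending the original image latent at the
        halfway mark so later frames don't drift away from the first image.

        `denoise_step` is a stand-in callable (latent, step) -> latent for
        whatever sampler the workflow actually uses.
        """
        latent = np.asarray(init_latent, dtype=np.float32).copy()
        init = latent.copy()
        for step in range(num_steps):
            latent = denoise_step(latent, step)
            if step == num_steps // 2:
                # Pull the working latent partway back toward the original image
                latent = (1.0 - blend) * latent + blend * init
        return latent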
Thank you. I've noticed that too, and also that it has a tendency to introduce garbled text when not given a prompt (or a short one).
This is using the default parameters for the ComfyUI workflow (including a negative prompt written in Chinese), so there is a lot of room for adjustments.
I think the main reason is that the model has a lot of training material with Chinese text in it (I'm assuming, since the research group who released it is from China), but having the negative prompt in Chinese might also play a role.
What I've found interesting so far is that sometimes the image plays a big part in the final video, but other times it gets discarded almost immediately after the first few frames. It really depends on the prompt, so prompt engineering is (at least for this model) even more important than I expected. I'm now thinking of adding a 'system' positive prompt and appending the user prompt to it.
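Something like this, as a rough sketch; the prompt text and helper name are just placeholders, not part of the actual workflow.

    # Hypothetical server-side "system" positive prompt, prepended to user input.
    SYSTEM_POSITIVE = (
        "smooth, consistent animation that stays faithful to the style of the "
        "original drawing, no text or captions"
    )

    def build_prompt(user_prompt: str) -> str:
        """Prepend a fixed 'system' prompt to whatever the user typed, so every
        generation starts from the same stylistic baseline."""
        user_prompt = user_prompt.strip()
        return f"{SYSTEM_POSITIVE}. {user_prompt}" if user_prompt else SYSTEM_POSITIVE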
Would be interesting to see how much a good "system"/server-side prompt could improve things. I noticed some animations kept the same sketch style even without specifying that in the prompt.
I'm actually trying to reduce the 'funkiness'; initially the idea was to start from a child's sketch and bring it to life (so kids can safely use it as part of an exhibit at an art festival) :)
There's a world of possibilities though, I hadn't even thought of combining color channels.
I think they were suggesting that it might be possible to inject the initial sketch into every image/frame such that the model will see it but not the end user. Like a form of steganography which might potentially improve the ability of the model to match the original style of the sketch.
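Roughly what that could look like, as a sketch: this assumes the frames and the resized original sketch are available as numpy arrays, and the low-alpha blend is just one interpretation of "visible to the model but not the user".

    import numpy as np

    def embed_sketch(frames: np.ndarray, sketch: np.ndarray, alpha: float = 0.03) -> np.ndarray:
        """Blend the original sketch into every frame at very low opacity.

        frames: (T, H, W, C) uint8 video frames
        sketch: (H, W, C) uint8 original drawing, already resized to match
        alpha:  small enough to be essentially invisible to a viewer, but
                still present in the pixel statistics the model conditions on
        """
        frames = frames.astype(np.float32)
        sketch = sketch.astype(np.float32)
        out = (1.0 - alpha) * frames + alpha * sketch[None, ...]
        return np.clip(out, 0, 255).astype(np.uint8)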