This is the first week since the Christmas break that we have seen this much AI progress. Or, as the headlines all say, "It was an insane week in AI... Again."
Google dropped two multimodal models this week. One was disappointing; the second is much better. The Ultra version has a 1,000,000-token context window, and it is faster. There are already plans to go to a 10,000,000-token context window. It can watch a movie and find the scene you described, or sketched, for it.
OpenAI dropped a text-to-video model that blows away all other text-to-video generators. It is amazing. It can generate 60-second videos that are very stable, where even the people in the background look realistic and move realistically.
Open source models are also still improving. Stability AI released Stable Cascade, which renders text exceptionally well, and the model works differently than previous models. It can't be used for commercial purposes yet because it is still in beta. A lot of people are using it to make logos with text right now.
Stable Diffusion Turbo models continue to improve. About half of the images I have been posting lately were done in just 4 steps. I generated over 900 images this week on my CPU-only Ryzen 7 minicomputer using Turbo models.
lllyasviel also released a fork of Automatic1111's WebUI called Forge. This is a GPU-only release. It doesn't do much for high-end cards, but people with 6GB GPUs can see up to a 75% increase in speed. It also uses a lot less GPU memory, allowing a 4x larger batch size or up to a 3x increase in image size. It also speeds up 4GB cards by about 30%, and it will let you run SDXL on 4GB or 6GB GPUs. That is a big deal.
On the open source text front, people are still figuring out how to finetune the open source Mixtral models, and progress is being made. Once they figure it all out, there should be some amazing models released. I also checked out a lot of smaller models and found a really good 3B-parameter model that works great on my Intel Mac with just 8GB of RAM. It should even run on high-end phones. Check out stablelm-zephyr-3b.
I made progress on my electronics projects. I am at the point where I want to start implementing a web UI, and I also plan to integrate a small neural net library. I also want to build a simple weather station with a soil moisture sensor. My threading library is kicking ass. I build support for new hardware as a simple example sketch; once it is working, I can pull that sketch into my overall program as a scheduled task in just a few minutes. That platform is only calling 12 functions a second on one CPU, which is just a fraction of the processing available on a Raspberry Pi Pico. It is working well enough so far that it could grow into an open source framework to replace all the IoT crap hardware that exists in the world.
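The pattern of pulling example sketches into one program as scheduled tasks can be sketched roughly like this. This is my own minimal illustration in plain C, not the author's actual library; the names (`add_task`, `run_tasks`) and the fixed-size task table are assumptions. On real hardware, `run_tasks` would be called from the main loop with something like Arduino's `millis()`.

```c
#include <stdint.h>

/* Hypothetical minimal cooperative scheduler: each task records how
   often it should run and when it last ran; run_tasks() fires any
   task whose interval has elapsed. */

#define MAX_TASKS 8

typedef void (*task_fn)(void);

typedef struct {
    task_fn  fn;           /* the task's function           */
    uint32_t interval_ms;  /* how often it should run       */
    uint32_t last_run_ms;  /* when it last ran              */
} task_t;

static task_t tasks[MAX_TASKS];
static int task_count = 0;

/* Register a task to run every interval_ms milliseconds. */
void add_task(task_fn fn, uint32_t interval_ms) {
    if (task_count >= MAX_TASKS) return;
    tasks[task_count].fn = fn;
    tasks[task_count].interval_ms = interval_ms;
    tasks[task_count].last_run_ms = 0;
    task_count++;
}

/* Call repeatedly from the main loop with the current time. */
void run_tasks(uint32_t now_ms) {
    for (int i = 0; i < task_count; i++) {
        if (now_ms - tasks[i].last_run_ms >= tasks[i].interval_ms) {
            tasks[i].last_run_ms = now_ms;
            tasks[i].fn();
        }
    }
}

/* Two example tasks that just count their own invocations,
   standing in for real sensor/LED sketches. */
static int sensor_reads = 0;
static int led_toggles  = 0;
static void read_sensor(void) { sensor_reads++; }
static void toggle_led(void)  { led_toggles++;  }
```

Each new hardware sketch just becomes another function handed to `add_task`, which is why folding a working example into the main program only takes a few minutes.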