Rock legend Ozzy Osbourne dies at 76 de.rt.com/international/251368… British rock musician Ozzy Osbourne has died at the age of 76, his family announced on Tuesday. #news #press

Let’s end the age of plastic! - Greenpeace International


Please sign and share this petition. Thank you!

greenpeace.org/international/a…

So Win11 really is going to take screenshots of everything now, huh. It only felt real when Brave said it would block this by default. I'm starting to think #EndOf10 isn't enough; we probably need #EndOf11 too. Sigh. The painful part is that it's what I have to use at work.

True minimalism!!! I definitely have to try it 🤣

"Mwm: an X11 window manager in 20 lines of code

Is KDE too much for you? GNOME tries to do too much? Xfce still a bit too fancy? Do you need something smaller? Even more minimalist? What about a mere 20 lines of code which provide the absolute barest possible minimum of window management functionality?

You need mwm."

#mwm #windowmanager #unix #linux #bsd #unixITA #linuxITA #guuf #guufITA #guufxmmp #fedilug

osnews.com/story/142853/mwm-an…

Full-length trailer for the anime movie "Fushigi no Kuni de Alice to -Dive in Wonderland-" by P.A. Works, adapted from Lewis Carroll's classic children's novel Alice in Wonderland. Opens in Japan on 29 Aug 2025.

Theme song: "Zukan" by SEKAI NO OWARI

#fushigialice

Unpacking the success of "PIM 24", the print house and production team behind SMEs nationwide, covering everything from print work to booth-display production as a one-stop service https://digitalmore.co/ถอดความสำเร็จ-pim-24-โรงพิม/

Pokémon Legends: Z-A - experience a new style of combat, "battle" ✕ "action", now open for pre-orders with special bonus gifts! https://digitalmore.co/pokemon-legends-z-a-สัมผัสประสบการณ์เกม/

ตัวอย่าง "Aikatsu! x PriPara THE MOVIE -Deai no Kiseki!-" โดย BN Pictures เข้าฉายในโรงภาพยนตร์ที่ญี่ปุ่น 10 ต.ค. 2025 Anime Movie เรื่องนี้เป็นโครงการฉลองครบ 10 ปี ของทั้งสองแฟรนไชส์

#aikatsupripara

Tokyo Skytree hosts a 30th-anniversary celebration for "Toy Story" https://digitalmore.co/โตเกียวสกายทรี-จัดงานฉล/

Lee Kuan Yew School of Public Policy Releases Strategic Roadmap for ASEAN’s 5G-AI Transformation digitalmore.co/lee-kuan-yew-sc…

Lee Kuan Yew School of Public Policy unveils a strategic roadmap for transforming ASEAN with 5G and AI https://digitalmore.co/lee-kuan-yew-school-of-public-policy-เผยแผนยุทธศาสตร์พล/

ตัวอย่าง TV Anime "Wandance" (Get Something Out of Your Mind เพลงประกอบโดย Yaffle feat. Yujin Aramaki) โดย Madhouse x Cyclone Graphics เริ่มออกอากาศภายในเดือน ต.ค. 2025

- Koki Uchiyama➠Kaboku Kotani
- Hina Yomiya➠Hikari Wanda

#wandance

New trailer for the TV anime "Chanto Suenai Kyuuketsuki-chan" by feel., airing from Oct 2025.

OP: Seishun no Silhouette by H△G
ED: Senkou Hanabi by H△G & Minami Tanaka

Additional voice cast
- M.A.O➠Eiko Sakuma
- Ikumi Hasegawa➠Misa Kusunoki

#chanto

DIY Website Tools vs $99 Custom Builds – Which Is Faster and Less Risky?


Neither DIY nor a $99 custom build is perfect. DIY can drain your time if you’re inexperienced. Cheap web design can save effort but risks shoddy work or unreliability.

On a tight budget for your website? You’re likely weighing two options: a DIY website builder or hiring a cheap web designer for around $99. Both promise a low-cost way to get online, but which is quicker? Which is safer? And what do you actually get? Let’s break it down.

The DIY Approach: Drag-and-Drop Freedom


Platforms like Wix, Shopify, or WordPress.com make DIY websites seem like a breeze. Choose a template, drag elements, add your content, and you’re live. No coding skills needed. For a basic site, you could be done in a few hours. But things get complicated fast.

Go beyond a simple layout, and you’ll hit roadblocks:

  • Why does my site look broken on phones?
  • Why is this image blurry?
  • How do I make my pages load faster?
  • Why is this menu not working?

What started as a quick task can turn into weeks of troubleshooting, searching forums, and watching tutorials. DIY is only fast for simple sites—and if something breaks, you’re the one fixing it.

The $99 Custom Solution


A cheap web design offer for $99 or slightly more sounds tempting. You’ll find these deals from freelancers or new designers on platforms like Upwork or Fiverr. They promise a site in days, often using templates and your provided text or images.

But $99 doesn’t stretch far. It might get you a basic site, but don’t expect revisions, ongoing help, or custom features. Some designers deliver decent work; others reuse generic templates or disappear after payment. If you’re clear about your needs and choose wisely, this can be faster than DIY—but it’s a gamble.

Which Is Faster?


It depends on your skills. If you’ve built sites before, DIY might be quicker. You know how to tweak templates, fix layouts, and handle uploads. A simple site could take a day or two.

If you’re new to this, DIY can eat up time. You’ll wrestle with tools, resize images, and search for fixes online. A cheap web designer, if reliable, could deliver in a few days. But low prices often mean no revisions or extra support, so be precise about what you want.

Which Is Safer?


Neither is risk-free. DIY gives you control, but that means you’re responsible for mistakes. Most platforms don’t automatically handle SEO, security settings, or mobile optimization. Errors can hurt your site’s performance or visibility without you noticing.

A cheap web designer isn’t much safer. Some use outdated templates or free tools that break easily. Others might vanish mid-project. Your safety depends on their skill and your ability to pick a good one.

The Real Price Tag


Cheap web design—whether DIY or hired—has hidden costs. You might miss out on:

  • Custom features like contact forms or e-commerce
  • Proper SEO (titles, descriptions, image optimization)
  • Mobile-friendly design
  • Ongoing maintenance or updates
  • Site testing for speed or bugs

These gaps can lead to bigger expenses later, whether it’s time spent fixing issues or paying for a full rebuild.

Who Should Go DIY?


DIY is ideal if you’re tech-savvy, have time to learn, and need a simple site—like a portfolio or personal page. It’s also a great way to build skills for future projects. Just expect a learning curve and some frustration along the way.

Who Should Choose a $99 Designer?


If you’re in a hurry, avoid tech, or need a site fast, a cheap web designer might be better. To reduce risks, ensure they:

  • Provide a clear timeline
  • Show examples of past work
  • List what’s included (and excluded)
  • Give you full control of the site

If they’re vague or unprofessional, keep looking.

Final Take


Neither DIY nor a $99 custom build is perfect. DIY can drain your time if you’re inexperienced. Cheap web design can save effort but risks shoddy work or unreliability. To succeed, keep your project simple, set clear goals, and don’t expect miracles on a tiny budget. Getting online is one thing; staying online without stress is another.

5 shops for the cutest hair ties, adorable enough to melt your heart; they look great tied on and you can wear them every day without getting bored https://digitalmore.co/5-ร้านยางมัดผมสุดคิ้วท์/

Steam gaming finally comes to RISC-V
AAA titles like The Witcher 3 and Crysis now playable thanks to revamped emulation tool

The essential underlying magic is supplied by the recently updated Felix86 emulator project.

tomshardware.com/pc-components…
#RiscV

Create a REST API for the Microsoft/BitNet B1.58 model and integrate it with an Open WebUI


Now I'm writing/semi-complaining lol. Normally I use models from Ollama, and I happened to come across this person's post on X (Twitter). They have a Microsoft model that people say can run on CPU just fine, and if you use something like M2, it'll be even faster.

Microsoft just released a 1-bit LLM with 2B parameters that can run on CPUs like Apple M2.

BitNet b1.58 2B4T outperforms full-precision LLaMA 3.2 1B while using only 0.4GB of memory versus 2GB, and processes tokens 40% faster.

100% opensource. pic.twitter.com/kTeqTs6PHd

— Shubham Saboo (@Saboo_Shubham_) April 18, 2025


And that model is microsoft/BitNet b1.58 2B4T. After seeing the news in APR-2025, I waited to see if anyone would try to implement it in Ollama. I saw people asking about it too, but there was still no update.

I waited a long time, until June 2025, and still nothing. Oh well, let me find a way to run it myself from the code then. Initially, I set a simple goal: put the model in a container and find something that can expose an endpoint Open WebUI can talk to (Open WebUI is a web interface for chatting, like ChatGPT). So this blog documents my experience with the experiment.

If you want to read the Thai version, it's available here (อ่านได้ที่นี่).


Getting Ready to Run microsoft/BitNet

- Linux (I actually tried this in Docker)


I start from a Python image and install dependencies with this Dockerfile:
# Use official Python 3.12 image
FROM python:3.12-slim

# Install system dependencies for PyTorch and build tools
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential \
    cmake \
    git \
    curl \
    ca-certificates \
    libopenblas-dev \
    libomp-dev \
    libssl-dev \
    libffi-dev \
    wget \
    && rm -rf /var/lib/apt/lists/*

# (Optional) Set a working directory
WORKDIR /app

# Copy your requirements.txt if you have one
COPY requirements.txt .
RUN pip install --upgrade pip && pip install -r requirements.txt
Create a requirements.txt file:
fastapi==0.110.2
uvicorn[standard]==0.29.0
transformers==4.52.4
torch==2.7.0
numpy==1.26.4
accelerate==0.29.0
Run it normally (standard execution):
# Build the image
docker build -t python-bitnet .

# Run the container with port forwarding and your code mounted
docker run -it -p 8888:8888 -v "$PWD":/app python-bitnet /bin/bash
Use a DevContainer (this is the method I tried - a minimal config sketch follows below)
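
For reference, a minimal .devcontainer/devcontainer.json along these lines should be enough to reuse the Dockerfile above (this config is my sketch, not the exact file from the repo):

{
    // Sketch: minimal DevContainer config pointing at the Dockerfile above
    "name": "python-bitnet",
    "build": {
        "dockerfile": "../Dockerfile",
        "context": ".."
    },
    "forwardPorts": [8888],
    // Re-run installs in case requirements.txt changes after the image is built
    "postCreateCommand": "pip install -r requirements.txt"
}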

- Windows: This one has quite a few steps, for those who like challenges


I say it's challenging because I tried it and got stuck for 2 weeks lol. On Linux it's done in a flash. For anyone who wants to try, you need the following:

  • For Visual Studio, you need to install the additional C++ build components.



  • Running plain PowerShell won't work - you have to run things in the Developer Command Prompt for VS 2022 or Developer PowerShell for VS 2022.
    A regular terminal doesn't set all the variables like Path properly, so you'll hit errors like


Error C1083: Cannot open include file: 'algorithm': No such file or directory

Even if you run vcvarsall.bat x64 yourself, it's hit or miss - sometimes it works, sometimes it doesn't.

Note: vcvarsall.bat is located in "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsall.bat"


  • Add the Python .lib directory to your path - if you don't include it, the build crashes with:


fatal error LNK1104: cannot open file 'python312.lib'

Add the Python lib directory to PATH for running the C++ libraries used for AI inference.
And once the environment is ready:

  • Set up a virtual environment


# Set ENV
python3 -m venv bitnet-env
# or
python -m venv bitnet-env

  • Activate the virtual environment


# Linux
source bitnet-env/bin/activate

# Windows - PowerShell
.\bitnet-env\Scripts\Activate.ps1

# Windows - CMD
.\bitnet-env\Scripts\activate.bat

  • Install the required libraries from requirements.txt - if you're using the Linux/Docker approach, these dependencies are already included in the container image.


fastapi==0.110.2
uvicorn[standard]==0.29.0
transformers==4.52.4
torch==2.7.0
numpy==1.26.4
accelerate==0.29.0

pip install --upgrade pip && pip install -r requirements.txt
pip install git+github.com/huggingface/transfo…

Writing Code to Use Model from Hugging Face


With all the ENV issues resolved, let's get coding. I mentioned wanting to connect it with Open WebUI, so I made 2 versions: a command-line version and an API version.

- Command Line


I tried writing the code using:

  • Transformers: For loading pre-trained models from Hugging Face
  • PyTorch: For model inference behind transformers (my machine's specs are only good enough for inference, not for training/fine-tuning). PyTorch comes into play in several spots, such as:
    - bfloat16: half the memory of FP32; versus FP16 it is the same size but trades mantissa precision for a wider exponent range
    - return_tensors="pt": asks the tokenizer for PyTorch tensors
    - to(model.device): enables GPU acceleration if CUDA is available


import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    force_download=True,
)

# Apply the chat template + role
messages = [
    {"role": "system", "content": "You are a Senior Programmer."},
    {"role": "user", "content": "Can you help me with a coding problem?"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate response
chat_outputs = model.generate(**chat_input, max_new_tokens=50)
response = tokenizer.decode(chat_outputs[0][chat_input['input_ids'].shape[-1]:], skip_special_tokens=True)
print("\nAssistant Response:", response)

The command-line version - you'll see that Windows has many limitations.
Another version adds a loop and keeps prompting until you type "Thank you BITNET" - you can see the source code here; a rough sketch follows.
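
The linked code isn't reproduced here, but a minimal sketch of that loop might look like this (it reuses the tokenizer and model objects from the script above; the exit phrase is from the post, everything else is my approximation):

# Rough sketch of the interactive loop version (see the linked source for the real one).
# Assumes `tokenizer` and `model` are already loaded as in the script above.
EXIT_PHRASE = "Thank you BITNET"

messages = [{"role": "system", "content": "You are a Senior Programmer."}]
while True:
    user_input = input("You: ")
    if user_input.strip() == EXIT_PHRASE:
        break
    messages.append({"role": "user", "content": user_input})
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
    chat_outputs = model.generate(**chat_input, max_new_tokens=200)
    response = tokenizer.decode(
        chat_outputs[0][chat_input["input_ids"].shape[-1]:],
        skip_special_tokens=True,
    )
    # Keep the assistant turn so the next prompt carries the conversation history
    messages.append({"role": "assistant", "content": response})
    print("Assistant:", response)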

- API


Note: I didn't research which libraries can make our API connect directly to Open WebUI. Initially, I looked at which connection standards Open WebUI supports - for the text-prompt part, there are OpenAI and Ollama options.

I chose the OpenAI API route because, from playing with the dotnet Semantic Kernel before, I knew the /v1/chat/completions pattern, so I started there, added the API in WebUI, and watched which paths it hit in our code.

From what I tested, I found there are 3 API endpoints at minimum that Open WebUI calls to us:

  • /v1/chat/completions
  • /v1/models
  • /health

For /v1/chat/completions, I just kept adding whatever it complained about + asked AI until I had all 3 APIs done, like this:
import time
import uuid
from datetime import datetime
from typing import List, Dict, Optional

import torch
from fastapi import FastAPI
from fastapi.responses import JSONResponse
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

app = FastAPI()

# Load model and tokenizer at startup
model_id = "microsoft/bitnet-b1.58-2B-4T"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    force_download=True,
)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    messages: List[Message]
    max_new_tokens: Optional[int] = 700

class Choice(BaseModel):
    index: int
    message: Dict[str, str]
    finish_reason: str

class ChatResponse(BaseModel):
    id: str
    object: str
    created: int
    model: str
    choices: List[Choice]

@app.post("/v1/chat/completions", response_model=ChatResponse)
async def chat_completions(request: ChatRequest):
    # Prepare prompt using the chat template
    prompt = tokenizer.apply_chat_template(
        [msg.dict() for msg in request.messages],
        tokenize=False,
        add_generation_prompt=True
    )
    chat_input = tokenizer(prompt, return_tensors="pt").to(model.device)
    chat_outputs = model.generate(**chat_input, max_new_tokens=request.max_new_tokens)
    response = tokenizer.decode(
        chat_outputs[0][chat_input['input_ids'].shape[-1]:],
        skip_special_tokens=True
    )
    # Return the response in OpenAI-compatible format
    return ChatResponse(
        id=f"chatcmpl-{uuid.uuid4().hex[:12]}",
        object="chat.completion",
        created=int(time.time()),
        model=model_id,
        choices=[
            Choice(
                index=0,
                message={"role": "assistant", "content": response},
                finish_reason="stop"
            )
        ]
    )

@app.get("/")
def root():
    """Root endpoint with API info"""
    return JSONResponse({
        "message": "OpenAI-Compatible API for Open WebUI",
        "version": "1.0.0",
        "endpoints": {
            "models": "/v1/models",
            "chat": "/v1/chat/completions",
            "health": "/health"
        }
    })

@app.get("/health")
def health_check():
    """Health check endpoint"""
    return JSONResponse({"status": "healthy", "timestamp": datetime.now().isoformat()})

@app.get("/v1/models")
def list_models():
    """List available models"""
    return JSONResponse({
        "data": [
            {
                "id": model_id,
                "object": "model",
                "created": datetime.now().isoformat(),
                "owned_by": "microsoft",
                "permission": []
            }
        ]
    })
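
To check everything responds before wiring it into Open WebUI, a quick stdlib-only smoke test like this should do (it assumes the API above is running locally on port 8888, e.g. via uvicorn main:app --port 8888 - the module name "main" is just my assumption about the file name):

import json
import urllib.request

BASE = "http://localhost:8888"

# The two GET endpoints Open WebUI polls
for path in ("/health", "/v1/models"):
    with urllib.request.urlopen(BASE + path) as resp:
        print(path, "->", resp.read().decode())

# One OpenAI-style chat completion request
payload = json.dumps({
    "messages": [{"role": "user", "content": "Hello, who are you?"}]
}).encode()
req = urllib.request.Request(
    BASE + "/v1/chat/completions",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.loads(resp.read())
    print(body["choices"][0]["message"]["content"])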
When using it, I made it into Docker. During build, I was shocked by the size - almost 10 GB.

Tried using it for real and connecting it with Open WebUI - sometimes it gives okay answers, sometimes it hallucinates lol

But what's for sure is the CPU usage shoots up lol

That concludes my rough trial of running the model; if I find something better, I'll write another blog post. Feel free to reach out with suggestions. Oh, and don't force it on Windows - mine got strangled by WSL2. Taking an old notebook and installing Linux to make a local AI inference engine is still faster.

For all the code, I've uploaded it to Git: github.com/pingkunga/python_mi…


#python #EnglishBlog #SLM #microsoftBitNet #LocalAIModel

Green Energy Utilization in Biochar Production


A Sustainable Approach

In the era of sustainable development, green energy solutions are gaining significant attention as we look for ways to reduce carbon footprints and promote circular economies. One such innovative solution lies in biochar production, a process that not only benefits the environment but also provides a renewable energy source.

Biochar—a form of carbon-rich charcoal produced through the pyrolysis of organic materials—has become a critical component in various environmental and agricultural solutions. However, its production process, particularly the biochar pyrolysis machine at its heart, offers an often-overlooked opportunity for energy recovery and efficiency.

The Role of Pyrolysis in Energy Recovery
The biochar production process involves heating organic matter, such as agricultural waste, forestry residues, or even municipal solid waste, in a low-oxygen environment. This process, known as pyrolysis, breaks down the material into three key products: biochar, bio-oil, and syngas (synthesis gas). While biochar is the desired product for soil enhancement and carbon sequestration, the other by-products—bio-oil and syngas—serve as potential energy sources.

Biochar pyrolysis machines play a central role in this process. These machines are designed to efficiently convert organic materials into biochar while recovering the energy released during pyrolysis. The bio-oil and syngas produced can be utilized to power the machine itself or be harnessed for other industrial processes, such as electricity generation or heating. This recovery of energy reduces reliance on external power sources, making the biochar production process significantly more sustainable.

Circular Economy and Biochar Production
The concept of a circular economy revolves around maximizing the use of resources while minimizing waste. In biochar production, the pyrolysis process embodies this principle by utilizing waste materials (such as agricultural or forest residues) and converting them into valuable products like biochar, while simultaneously recovering energy in the form of syngas and bio-oil.

By integrating the energy recovery capabilities of biochar pyrolysis machines, manufacturers can achieve a closed-loop system where the energy required for production is largely self-sustained. The use of recovered syngas to fuel the pyrolysis process can significantly reduce energy consumption, thereby lowering operational costs and carbon emissions associated with external energy sources.

Environmental Benefits and Sustainability
Not only does biochar serve as an effective tool for carbon sequestration, but the process itself contributes to environmental sustainability. By capturing and storing carbon during pyrolysis, biochar helps mitigate the effects of climate change. Additionally, the energy recovery aspect of biochar production helps reduce the environmental impact of the process. By utilizing biochar pyrolysis machines, industries can produce biochar while simultaneously reducing their reliance on fossil fuels and lowering their overall carbon emissions.

Conclusion
As the world continues to seek greener, more sustainable solutions, biochar production stands out as a promising technology. The integration of energy recovery mechanisms in biochar pyrolysis machines offers significant advantages, promoting a circular economy where waste is minimized, energy is recovered, and environmental benefits are maximized.

With increasing attention on sustainability, this innovative process could soon become a key player in both environmental management and renewable energy generation, paving the way for a cleaner, more efficient future.

Create a REST API for the Microsoft/BitNet B1.58 model and integrate it with an Open WebUI

Now I'm writing/semi-complaining lol. Normally I use models from Ollama, and I happened to come across this person's post on X (Twitter). They have a Microsoft model that people say can run on CPU just fine, and if you use something like M2, it'll be even faster. Microsoft just released a 1-bit LLM with 2B parameters that can run on CPUs like Apple M2.

naiwaen.debuggingsoft.com/2025…




On the way home just now I passed a corgi, and it struck me that nearly every dog I've seen in Japan is at the standard weight per the guidelines; no compromising on "a little chubbier would be cuter." If Thai people saw the goldens, labs, corgis, shelties, etc. around here, they'd probably grumble that the poor things look too skinny.

P.S. Overweight cats, though, I see all the time lol

„Kugel ist schussbereit" ("The bullet is ready to fire"): Confidant of the Magdeburg rampage driver announces an act of violence and goes into hiding apollo-news.net/kugel-ist-schu… In an MDR documentary examining the 2024 attack on the Magdeburg Christmas market by the Saudi doctor Taleb al-Abdulmohsen, ...

Check out 25 denim-shorts outfit ideas: chic, chill, comfy looks with style digitalmore.co/25-denim-shorts…

Come to the Debian Release Party in Bangkok!

Join us in Bangkok to celebrate the latest Debian release!

What: Debian Release Party - come "eat, drink, chat", swap experiences, and get to know friends in the Debian community

Where: Jojo Soba, 2nd floor (map: openstreetmap.org/way/12212361…)

When: Saturday 9 August 2025, from 14:00 onwards

Don't miss this great opportunity to get together with like-minded people. Come be part of the celebration!

The latest episode of A Bit Fruity covers LGBTQ/trans rights. There's an argument in society right now over whether the current LGBTQ+ movement is demanding "too much", whether it's out to wreck society. open.spotify.com/episode/52bYI…

Listening with the context stripped away, I felt this episode put a lot of questions to me.

1. How does a civil-rights movement do things ~the right way~ and win society's acceptance? Is creating open dialogue actually useful? Should we spend time talking with people whose hearts are full of bigotry?

2. What happens once we reach the goal while others who aren't standing where we are still haven't gotten there?

in reply to m

4. This one does depend on context: I feel like America is a country obsessed with trans people. The show says the media loves putting trans issues in the spotlight. In one sense there are real social issues and problems, but in another, trans people are a small group relative to the country's population, nowhere near big enough to threaten anyone.

Personally, I was born in the 90s and find gay and lesbian issues totally chill, but moving on to nb+trans, there are still many things I don't understand, things that feel far beyond my own frame of reference.

in reply to m

I think that if we want to stand on the right side of history, it's precisely the hard issues that make us deeply uncomfortable, like this one, that we have to learn about and settle our own position on.

I still don't have the courage or conviction to create change on that scale, but I'll keep trying 😭

Based on an article in a textbook: a while back, Japan tried to fix the problem of kids over-cramming at tutoring schools with the ゆとり教育 ("relaxed education") approach, an option for parents who didn't want their kids studying so hard. The classic example: pi = 3.142 or 22/7 is too hard, so just teach pi = 3 flat.
in reply to rarirurero

I think the underlying idea of learning without rushing is good, but the lesson design shouldn't have made pi a smaller number; it should have spent more time on understanding where pi comes from. If kids in the normal program learn area and circumference in one period, we could take 3-4 periods, but use them to understand where 22/7 comes from. Doing ゆとり as pi = 3 broke things badly, the gap between kids got wider than before, so it was dropped a decade ago.

Debian Release Party plan summary

City: Bangkok, Thailand
Activity: eat, drink, chat
Venue: Jojo Cafe
Time: 14:00, 9 August 2025

But I still can't register at wiki.debian.org/ReleasePartyTr…. If possible, and if you @thep can already get in, I'd like to ask for your help with it.

@thep
in reply to Vee: ดิจิทัล

Is there anything you'd like to fill in for the various fields?
* '''When:''' 9 August 2025, 14:00 local time
* '''Where:''' Jojo Cafe ([[https://www.openstreetmap.org/way/1221236172|map]])
* '''What:''' eat, drink, chat
* '''Provided:''' anything provided by the organisers like cake, balloons, food, drink, silly hats
* '''Bring:''' anything people should bring, like food, drink, costumes, OpenPGP fingerprints, etc
* '''More info:''' link to the page where people can get more information and/or discuss event
in reply to thep

'''Provided:''' Nothing is provided, but feel free to send something over.

'''Bring:''' Nothing either; just bring yourself and money for coffee.

* '''Promotion:''' [Thai] mastodon.in.th/@veer66/1149014…
[English] mstdn.io/@veer66/1149014882292…

* '''Reports:''' thailinuxusergroup.freeforums.…

