moochee · on Feb 13, 2024

I’m curious how you auto-renew Let's Encrypt certs. Would you mind sharing some tips? Thanks

tacone · on Feb 14, 2024

I am using nginx-proxy (look it up on GitHub). It wasn't that simple to put together, so I will look into MicroK8s in the future.

uolgoz · on Feb 17, 2024

traefik (https://traefik.io/traefik) is also pretty good at this. I've used it to get certs auto-renewed for my projects.
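Whichever proxy handles the renewal, it can help to check independently that it is actually happening. Below is a minimal sketch using only the Python standard library; it does not renew anything, it only reports how many days a served certificate has left. The function names are made up for illustration, and the 30-day threshold is an assumption based on Let's Encrypt certificates lasting 90 days and common clients renewing about 30 days before expiry.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443):
    """Connect to host and return the served certificate's notAfter string."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Convert a notAfter string (e.g. 'Jan  5 09:34:43 2018 GMT') to days left."""
    expiry = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expiry - now) / 86400


def renewal_looks_stuck(not_after, threshold_days=30, now=None):
    """True if the cert is closer to expiry than the assumed renewal window."""
    return days_until_expiry(not_after, now) < threshold_days
```

Running `renewal_looks_stuck(fetch_not_after("example.com"))` from a daily cron job would flag a certificate that should already have been renewed.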

moochee · on Dec 23, 2023

This is super nice, thanks for sharing. Using gh issue => pr => deployment flow is good, but it would be awesome to have an optional local dev flow so the iterations can go even faster.

yz-yu · on Dec 24, 2023

Yes. I found that the time is mainly spent on two things: 1. the OpenAI API call, which can't be optimized for now; 2. building and deploying, which can be sped up with a preset setup.

moochee · on Dec 22, 2023

Is it similar to OpenAI's GPT-4-Vision model?

moochee · on Dec 20, 2023

What problem does LoftQ solve? I can’t tell after looking at the readme.

pajop · on Dec 20, 2023

from the paper: https://arxiv.org/pdf/2310.08659.pdf

"LoftQ aims to solve the problem of the discrepancy between the quantized and full-precision model in the context of quantization and LoRA fine-tuning for Large Language Models (LLMs). By simultaneously quantizing an LLM and finding a proper low-rank initialization for LoRA fine-tuning, LoftQ significantly enhances generalization in downstream tasks."

Bard: https://bard.google.com/chat/31e0a3bb74b29b3b

"Based on the abstract, LoftQ aims to solve the performance gap observed when applying both quantization and LoRA fine-tuning to a pre-trained Large Language Model (LLM).

Here's a breakdown of the problem and LoftQ's approach:

Problem:

Quantization: Reduces the precision of model weights to save memory and computation, but can lower accuracy.
LoRA fine-tuning: Improves accuracy on specific tasks by adding a low-rank adapter, but can struggle with quantized models.
Combined approach: Applying both quantization and LoRA fine-tuning often leads to a performance gap compared to full fine-tuning.

LoftQ's solution:

Simultaneous quantization and LoRA initialization: LoftQ proposes a novel framework that quantizes the LLM while also finding a suitable low-rank initialization for LoRA. This helps bridge the gap between the quantized and full-precision model.
Improved generalization: This approach improves the model's ability to generalize well on downstream tasks, especially in challenging memory-constrained settings.

Evaluation and results:

LoftQ is tested on various NLP tasks like question answering and summarization. It outperforms existing quantization methods, particularly in low-precision scenarios like 2-bit and 2/4-bit mixed precision.

Overall, LoftQ tackles the challenge of combining quantization and LoRA fine-tuning for LLMs, leading to better performance and efficiency, especially in resource-limited environments."

moochee · on Dec 20, 2023

Thanks. LoftQ = Quantization + LoRA fine-tuning. What's the difference between LoftQ vs QLoRA then?

tourzhao · on Dec 20, 2023

LoftQ = Quantization optimized for LoRA + Better LoRA Adaptor initialization + LoRA fine-tuning.
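To make the first two pieces concrete, here is a rough NumPy sketch of the alternating initialization as described in the paper: quantize what the adapter does not yet explain, then refit the adapter to the quantization residual with a truncated SVD. This is not the official implementation. The toy uniform quantizer stands in for the real 4-bit or 2-bit formats, and the function names, rank, bit width, and step count are all illustrative assumptions.

```python
import numpy as np


def quantize_uniform(w, bits=4):
    """Toy symmetric uniform quantizer (a stand-in, not the paper's format)."""
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / levels
    if scale == 0:
        return w.copy()
    return np.round(w / scale) * scale


def loftq_init(w, rank=8, bits=4, steps=5):
    """Alternate between quantizing and fitting a low-rank term to the residual.

    Returns (q, a, b) such that q + a @ b.T approximates w.
    """
    a = np.zeros((w.shape[0], rank))
    b = np.zeros((w.shape[1], rank))
    for _ in range(steps):
        # Quantize the part of w that the current adapter does not cover.
        q = quantize_uniform(w - a @ b.T, bits)
        # Best rank-`rank` fit of the quantization residual via truncated SVD.
        u, s, vt = np.linalg.svd(w - q, full_matrices=False)
        root = np.sqrt(s[:rank])
        a = u[:, :rank] * root
        b = vt[:rank].T * root
    return q, a, b
```

By contrast, QLoRA as usually described quantizes the weights once and starts the adapter at zero, so fine-tuning begins from the quantized model rather than from something closer to the full-precision one.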

moochee · on Nov 22, 2023

Nice work, thanks for making it! A few nice-to-haves: 1. The readme doesn't mention that ollama has to be started manually in a terminal, though I figured that out. 2. A short video showing how it works during actual coding would help, especially for people who have never used gh copilot.

ex3ndr · on Nov 22, 2023

Yeah, I plan to add these as well. I want to refine it a little bit from a coding standpoint first.

moochee · on Nov 20, 2023

Nice work. Tbh the tech stack should be open sourced so it could be easily picked up and improved by a larger community

moochee · on Nov 18, 2023

Thanks. Can you be more specific?

Woshiwuja · on Nov 20, 2023

More specific on what? Screen recording software and editing software?

moochee · on Nov 22, 2023

I found it's hard to mirror my iPhone screen to my MacBook. Any ideas?

Woshiwuja · on Nov 27, 2023

probably an emulator is your best bet to record gameplay

moochee · on Nov 3, 2023

Do you have any example app on GitHub?

apitman · on Nov 3, 2023

You can use any app that supports OpenID Connect. Just set the client_id to the URL the app runs on.
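As a rough illustration of what that looks like from the app side, the sketch below builds a standard OpenID Connect authorization-code request where `client_id` is simply the app's own URL. The provider endpoint, the `/callback` path, and the function name are hypothetical; only the query parameter names come from the OpenID Connect spec.

```python
from urllib.parse import urlencode


def build_auth_url(authorize_endpoint, app_url, state):
    """Build an OIDC authorization request using the app URL as client_id."""
    params = {
        "response_type": "code",
        "client_id": app_url,                    # no pre-registration: the URL is the ID
        "redirect_uri": app_url + "/callback",   # hypothetical callback path
        "scope": "openid",
        "state": state,                          # opaque anti-CSRF value chosen by the app
    }
    return authorize_endpoint + "?" + urlencode(params)


# Hypothetical provider and app URLs, for illustration only.
url = build_auth_url("https://auth.example.com/authorize", "https://app.example.com", "xyz123")
```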

moochee · on Oct 30, 2023

Nice work. I saw that the JSON file will be published to a CDN. It may contain sensitive data, e.g. a company ID. How do you handle that?

moochee · on Oct 7, 2023

I’d like to recommend Insomnium, a 100% local-first Insomnia fork. https://archgpt.dev/insomnium