Commit d152a45

* Support hallucination score, deepeval part
* Support hallucination score, selfcheckgpt part
* Add workflow

---------

Signed-off-by: Kang Wenjing <wenjing.kang@intel.com>
Signed-off-by: yzheng124 <yi.zheng@intel.com>
Co-authored-by: yzheng124 <yi.zheng@intel.com>
1 parent 9cfe8f6 commit d152a45

8 files changed: +384 -0 lines changed

.github/reusable-steps/categorize-projects/action.yml

+7
@@ -16,6 +16,8 @@ outputs:
     value: ${{ steps.group-subprojects.outputs.qt }}
   js:
     value: ${{ steps.group-subprojects.outputs.js }}
+  unittest:
+    value: ${{ steps.group-subprojects.outputs.unittest }}
 
 runs:
   using: 'composite'
@@ -42,6 +44,8 @@ runs:
             qt+=("$dir")
           elif [ -f "$dir/main.py" ] && grep -q -- "--stream" "$dir/main.py"; then
             webcam+=("$dir")
+          elif [ -d "$dir/test" ]; then
+            unittest+=("$dir/test")
           else
             python+=("$dir")
           fi
@@ -53,13 +57,15 @@ runs:
         webcam_json=$(printf '%s\n' "${webcam[@]}" | jq -R -s -c 'split("\n") | map(select(length > 0))')
         qt_json=$(printf '%s\n' "${qt[@]}" | jq -R -s -c 'split("\n") | map(select(length > 0))')
         js_json=$(printf '%s\n' "${js[@]}" | jq -R -s -c 'split("\n") | map(select(length > 0))')
+        unittest_json=$(printf '%s\n' "${unittest[@]}" | jq -R -s -c 'split("\n") | map(select(length > 0))')
 
         echo "notebook=$notebook_json" >> $GITHUB_OUTPUT
         echo "python=$python_json" >> $GITHUB_OUTPUT
         echo "gradio=$gradio_json" >> $GITHUB_OUTPUT
         echo "webcam=$webcam_json" >> $GITHUB_OUTPUT
         echo "qt=$qt_json" >> $GITHUB_OUTPUT
         echo "js=$js_json" >> $GITHUB_OUTPUT
+        echo "unittest=$unittest_json" >> $GITHUB_OUTPUT
     - name: Print subprojects to test
       shell: bash
       run: |
@@ -69,3 +75,4 @@ runs:
         echo "Webcam subprojects: ${{ steps.group-subprojects.outputs.webcam }}"
         echo "Qt subprojects: ${{ steps.group-subprojects.outputs.qt }}"
         echo "JS subprojects: ${{ steps.group-subprojects.outputs.js }}"
+        echo "Unit test subprojects: ${{ steps.group-subprojects.outputs.unittest }}"
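
For readers less familiar with the `jq` idiom used in this action, the following is a minimal Python sketch of what piping `printf '%s\n'` over a Bash array into `jq -R -s -c 'split("\n") | map(select(length > 0))'` produces. The function name `dirs_to_json` and the sample path are hypothetical and not part of the repository. The sketch shows why an empty array serializes to `[]`, which is the value the workflow's `if:` condition compares against.

```python
import json


def dirs_to_json(dirs):
    # Hypothetical helper mirroring the shell pipeline in the action.
    # printf with no arguments still emits its format once: a single newline.
    raw = "".join(f"{d}\n" for d in dirs) if dirs else "\n"
    # jq -R -s reads all of stdin as one string; split it on newlines
    # and drop the empty entries, as map(select(length > 0)) does.
    parts = [p for p in raw.split("\n") if len(p) > 0]
    # jq -c prints compact JSON, so use separators without spaces.
    return json.dumps(parts, separators=(",", ":"))


print(dirs_to_json([]))  # prints []
print(dirs_to_json(["demos/virtual_ai_assistant_demo/test"]))
```

An empty result therefore reaches the workflow as the literal string `[]`, so a job guarded by `!= '[]'` is skipped when no subproject has a `test` directory.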

.github/workflows/sanity-check-demos.yml

+41
@@ -119,3 +119,44 @@ jobs:
           command: npm start
           project: ${{ matrix.subproject }}
           timeout: 1m
+
+  unittest:
+    needs: find-subprojects
+    if: ${{ needs.find-subprojects.outputs.unittest != '[]' }}
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [ubuntu-latest, windows-latest, macos-latest]
+        python: ["3.10", "3.12"]
+        subproject: ${{ fromJson(needs.find-subprojects.outputs.unittest) }}
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ./.github/reusable-steps/setup-os
+      - name: Set up Python ${{ matrix.python }}
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python }}
+      - uses: ./.github/reusable-steps/setup-python
+        with:
+          python: ${{ matrix.python }}
+          project: ${{ matrix.subproject }}
+      - name: Login to HF
+        shell: bash
+        run: |
+          huggingface-cli login --token ${{ secrets.HF_TOKEN }}
+      - name: Install ollama
+        shell: bash
+        run: |
+          curl -fsSL https://ollama.com/install.sh | sh
+      - name: Start ollama and setup deepeval to use ollama
+        shell: bash
+        run: |
+          ollama serve &
+          ollama pull deepseek-r1 &
+          deepeval set-ollama deepseek-r1
+      - uses: ./.github/reusable-steps/timeouted-action
+        with:
+          script: python test.py
+          project: ${{ matrix.subproject }}
+          timeout: 5h
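
The job starts `ollama serve` in the background, so later commands may run before the server accepts connections. One possible safeguard is a small readiness check, sketched below. Nothing like this is part of the commit: the helper name `wait_for_port` is hypothetical, the timeout values are arbitrary assumptions, and `11434` is Ollama's default port.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(interval)
    return False


# Assumed usage before pulling a model or running the tests:
#   if not wait_for_port("127.0.0.1", 11434, timeout=60.0):
#       raise SystemExit("Ollama server did not become reachable")
```

A check like this only confirms that the port is open; it does not confirm that a model pull has finished.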

demos/virtual_ai_assistant_demo/requirements.txt

+2
@@ -20,5 +20,7 @@ torch==2.5.1
 transformers==4.48.3
 pymupdf==1.24.10
 pyyaml==6.0.1
+selfcheckgpt==0.1.7
+deepeval==2.4.9
 
 gradio==5.12.0
@@ -0,0 +1,50 @@
+# How to select the hallucination computation algorithm
+
+Currently, two methods are available: [deepeval](#use-deepeval-to-compute-hallucination-score) and [selfcheckgpt](#use-selfcheckgpt-to-compute-hallucination-score).
+
+If you have a labeled evaluation dataset (i.e. questions paired with correct answers), choose [deepeval](#use-deepeval-to-compute-hallucination-score). If you do not have a labeled dataset, choose [selfcheckgpt](#use-selfcheckgpt-to-compute-hallucination-score), which computes the hallucination score from the consistency of the model's outputs.
+
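
To illustrate what scoring by output consistency means, here is a toy sketch: the same question is asked several times, and each sentence of the main answer is checked against the extra samples. This is not the selfcheckgpt implementation. The word-overlap measure and the names `sentence_support` and `consistency_score` are simplifying assumptions made only for illustration.

```python
import re


def _words(text):
    # Lowercased alphanumeric tokens; a crude stand-in for real tokenization.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def sentence_support(sentence, sample):
    """Fraction of the sentence's words that also appear in one sampled answer."""
    words = _words(sentence)
    if not words:
        return 1.0
    return len(words & _words(sample)) / len(words)


def consistency_score(answer, samples):
    """Toy score in [0, 1]: 0 means fully supported by the samples, 1 means unsupported."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences or not samples:
        # Nothing to compare; treated as consistent in this sketch.
        return 0.0
    per_sentence = []
    for sentence in sentences:
        # Average support across all samples, then invert it into a penalty.
        support = sum(sentence_support(sentence, s) for s in samples) / len(samples)
        per_sentence.append(1.0 - support)
    return sum(per_sentence) / len(per_sentence)


# Made-up example data: the second sentence is not backed by the samples.
answer = "A Negroni uses gin. It also uses vodka."
samples = [
    "A Negroni uses gin, Campari and vermouth.",
    "Gin, Campari and sweet vermouth make a Negroni.",
]
print(round(consistency_score(answer, samples), 3))
```

The point of the sketch is only the shape of the computation: claims that the model repeats across independent samples score low, while claims that appear in a single answer score high.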
+# Use deepeval to compute hallucination score
+## Prerequisite libraries
+1. [deepeval](https://github.com/confident-ai/deepeval)
+2. [Ollama](https://github.com/ollama/ollama/blob/main/README.md)
+
+## How to set up
+1. Install deepeval:
+```
+pip install -U deepeval
+```
+2. Install Ollama:
+Please refer to [ollama](https://github.com/ollama/ollama/blob/main/README.md#ollama)
+
+3. Run Ollama, taking `deepseek-r1` as an example:
+```
+ollama run deepseek-r1
+```
+4. Set deepeval to use Ollama for evaluation:
+```
+deepeval set-ollama deepseek-r1
+```
+
+## How to run the test
+```
+python test.py --personality /path/to/personality.yaml --check_type deepeval
+```
+
+## More to read
+[deepeval hallucination](https://docs.confident-ai.com/docs/metrics-hallucination)
+
+# Use selfcheckgpt to compute hallucination score
+## Prerequisite libraries
+1. [selfcheckgpt](https://github.com/potsawee/selfcheckgpt)
+
+## How to set up and run the test
+1. Install selfcheckgpt:
+```
+pip install selfcheckgpt==0.1.7
+```
+
+2. Run the test:
+```
+python test.py --personality /path/to/personality.yaml --check_type selfcheckgpt
+```
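
Both run commands in the README pass `--personality` and `--check_type` to `test.py`. The script itself is not shown in this excerpt, so the following is only a hypothetical sketch of how such flags could be parsed with `argparse`. The function name `parse_args`, the defaults, and the help texts are assumptions, not the script's actual behavior.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Compute a hallucination score (illustrative sketch)."
    )
    # The real default is unknown; None is an assumption made for this sketch.
    parser.add_argument("--personality", default=None,
                        help="Path to the personality YAML file.")
    # Restricting the values to the two methods the README names is an assumption.
    parser.add_argument("--check_type", choices=["deepeval", "selfcheckgpt"],
                        default="deepeval",
                        help="Which hallucination scoring method to use.")
    return parser.parse_args(argv)


args = parse_args(["--personality", "/path/to/personality.yaml",
                   "--check_type", "selfcheckgpt"])
print(args.check_type)  # prints selfcheckgpt
```

Using `choices` makes a mistyped method name fail at startup with a usage message rather than later in the run.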
@@ -0,0 +1,20 @@
+Can you suggest some popular fruit-based drinks that are healthy and refreshing?
+Can you suggest some recipes using your favorite juices or ingredients?
+Can you suggest some refreshing drinks with watermelon or lime?
+Can you suggest some tropical juices or smoothies with kiwi or banana?
+What are the ingredients in a classic Martini?
+What are some popular drinks that use pomegranate juice?
+Can you suggest a cocktail that uses honey?
+What are the ingredients in a classic Daiquiri?
+Can you recommend a cocktail that uses apple cider?
+What are some popular drinks that use cranberry juice?
+Can you suggest a cocktail that uses chocolate?
+What are the ingredients in a classic Negroni?
+Can you recommend a cocktail that uses almond milk?
+What are some popular drinks that use grapefruit juice?
+Can you suggest a cocktail that uses lavender?
+What are the ingredients in a classic Pina Colada?
+Can you recommend a cocktail that uses maple syrup?
+What are some popular drinks that use lemon or lime juice?
+Can you suggest a cocktail that uses cinnamon?
+What are the ingredients in a classic Bloody Mary?
@@ -0,0 +1,20 @@
+I'm planning to cook a classic spaghetti carbonara. What ingredients do I need?
+Can I substitute pancetta with bacon in my carbonara?
+I'm planning to make a vegan lasagna. What can I use instead of ricotta cheese?
+How long should I bake my lasagna for the best results?
+I'm making a chicken curry. What spices should I use for an authentic flavor?
+Can I use coconut milk instead of cream in my chicken curry?
+I'm planning to bake a chocolate cake. What type of cocoa powder is best?
+Can I use almond flour instead of all-purpose flour in my cake?
+I'm making a Caesar salad. What ingredients are essential for the dressing?
+Can I use Greek yogurt instead of mayonnaise in my Caesar dressing?
+I'm planning to cook a beef stew. What cut of beef is best for stewing?
+Can I use red wine instead of beef broth in my stew?
+I'm planning to cook a seafood paella. What types of seafood are best to use?
+Can I use brown rice instead of white rice in my paella?
+How do I achieve the perfect socarrat (crispy bottom) in my paella?
+I'm making a vegetarian chili. What beans are best to use?
+Can I add quinoa to my chili for extra protein?
+I'm planning to bake a batch of cookies. What type of sugar should I use?
+Can I substitute butter with coconut oil in my cookies?
+I'm making a Greek salad. What ingredients are essential?
