Releases: truera/trulens
trulens-1.4.3
What's Changed
Bug Fixes
- Filipg/fix hotspots for release by @sfc-gh-fgralinski in #1821
- LiteLLM fixes by @sfc-gh-jreini in #1829
- Don't track feedback functions. by @sfc-gh-dkurokawa in #1833
- Use context variables instead of thread_local since the latter doesn't actually inherit values from the parent threads. by @sfc-gh-dkurokawa in #1838
- remove redundant header in streamlit pills by @sfc-gh-jreini in #1839
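The switch from `thread_local` to context variables (#1838) can be illustrated with the standard library alone. The sketch below is not TruLens code; the names `local_state`, `ctx_state`, and `read_both` are hypothetical. It shows that a value stored in `threading.local` is invisible to a child thread, while a `ContextVar` value can be carried over by running the worker inside a copy of the parent's context.

```python
import contextvars
import threading

# Two ways of holding "current" state; both names are illustrative only.
local_state = threading.local()
ctx_state = contextvars.ContextVar("ctx_state", default=None)


def read_both(out):
    # Runs in the child thread and records what it can see.
    out["thread_local"] = getattr(local_state, "value", None)
    out["context_var"] = ctx_state.get()


# Set both values in the parent thread.
local_state.value = "parent"
ctx_state.set("parent")

seen = {}
# Copy the parent's context and run the worker inside that copy.
parent_ctx = contextvars.copy_context()
worker = threading.Thread(target=parent_ctx.run, args=(read_both, seen))
worker.start()
worker.join()

print(seen)  # {'thread_local': None, 'context_var': 'parent'}
```

The copy is explicit here: a plain `threading.Thread` does not necessarily start with the parent's context, so the caller passes it along with `copy_context().run`.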
Docs
- change link from slack to snowflake discourse by @sfc-gh-jreini in #1835
- update the README with community link by @sfc-gh-jreini in #1841
Otel and other updates
- Fix issue where for `llama-index` apps, we don't handle the app specific record root correctly. by @sfc-gh-dkurokawa in #1810
- Redefine span attribute dictionary so that the values of the dictionary map to kwargs of the function decorated or the return. by @sfc-gh-dkurokawa in #1826
- Replace `attributes` with `full_scoped_attributes`. by @sfc-gh-dkurokawa in #1827
- Rename `main_input`/`main_output`/`main_error` to `input`/`output`/`error`. by @sfc-gh-dkurokawa in #1828
- SDK bugbash action items by @sfc-gh-dhuang in #1830
- SDK - run.describe() to return Run object directly #1831 by @sfc-gh-dhuang in #1832
- Clean up tests a bit and add a test to check if span attributes fail, then everything fails. by @sfc-gh-dkurokawa in #1837
- Handle the possible creation of `TruSession` when `connector` argument is given to the tru app. by @sfc-gh-dkurokawa in #1842
- SDK run.start() should take ground_truth_output from dataset_spec by @sfc-gh-dhuang in #1843
Full Changelog: trulens-1.4.2...trulens-1.4.3
TruLens 1.4.2
What's Changed
- [SNOW-1901834] SDK: Run APIs: CREATE, GET, LIST, and DELETE and run.start() by @sfc-gh-dhuang in #1784
Full Changelog: trulens-1.4.1...trulens-1.4.2
TruLens v1.4.1
What's Changed
- Bugfix: Avoid ingestion of incorrectly generated spans. by @sfc-gh-apgupta in #1809
- Clean up semantic conventions. by @sfc-gh-dkurokawa in #1811
- Don't output a lot of span attributes if they're `None`. by @sfc-gh-dkurokawa in #1812
- Set default host for snowflake connector if not provided by @sfc-gh-dhuang in #1814
- Test conda build in PR pipeline. by @sfc-gh-dkurokawa in #1813
- Ensure there's only one record root span and also discern the main method if none given for `TruApp`/`TruCustomApp`. by @sfc-gh-dkurokawa in #1818
- OTEL -> Otel to follow general naming conventions and keep things consistent. by @sfc-gh-dkurokawa in #1819
New Contributors
- @sfc-gh-apgupta made their first contribution in #1809
Full Changelog: trulens-1.4.0...trulens-1.4.1
TruLens v1.4.0
What's Changed
- Create very general function to compute feedbacks. by @sfc-gh-dkurokawa in #1794
- even quieter logging by @sfc-gh-jreini in #1795
- Create distributed OTEL test. by @sfc-gh-dkurokawa in #1789
- Trulens hotspots by @sfc-gh-fgralinski in #1757
- Clean up tests a bit and remove `ai.observability.domain` as a span attribute. by @sfc-gh-dkurokawa in #1796
- Update semantic convention for EVAL and EVAL_ROOT and update test to use this. by @sfc-gh-dkurokawa in #1797
- SDK app wrapper (EXTERNAL_AGENT ) CRUD by @sfc-gh-dhuang in #1772
- Create Snowflake E2E version of the feedback computation test. by @sfc-gh-dkurokawa in #1798
- Allow for `ground_truth_output` to be specified to recorder and also clean up semantic conventions by @sfc-gh-dkurokawa in #1799
- add llamaindex/langchain conda build recipes by @sfc-gh-chu in #1767
- Create OTEL span processor that can add trulens span attributes into spans not originated by trulens. by @sfc-gh-dkurokawa in #1800
- Make main_method required only for TruChain and TruLlama by @sfc-gh-dhuang in #1802
- Set up new context managers to record. by @sfc-gh-dkurokawa in #1805
New Contributors
- @sfc-gh-fgralinski made their first contribution in #1757
Full Changelog: trulens-1.3.5...trulens-1.4.0
TruLens 1.3.5
What's Changed
- Fix `snowflake-sqlalchemy` autocommit issue. by @sfc-gh-dkurokawa in #1792
Full Changelog: trulens-1.3.4...trulens-1.3.5
TruLens 1.3.4
What's Changed
- Validation for sis dashboard names by @sfc-gh-chu in #1750
- Handle sync and async generators with the OTEL `@instrument` decorator. by @sfc-gh-dkurokawa in #1748
- Require version of snowflake ml python >= 1.7.2 by @sfc-gh-dhuang in #1752
- Write returns for "UNKNOWN" spans. by @sfc-gh-dkurokawa in #1753
- Set up basic E2E test. by @sfc-gh-dkurokawa in #1754
- Handle sync functions that aren't generators but return a generator. by @sfc-gh-dkurokawa in #1755
- Track OpenAI and Cortex costs in a similar manner to the non-OTEL world. by @sfc-gh-dkurokawa in #1756
- Remove some old OTEL logic. by @sfc-gh-dkurokawa in #1759
- Have llama-index emit context-retrieval spans. by @sfc-gh-dkurokawa in #1761
- Change base scope from "ai_observability" to "ai.observability". Also… by @sfc-gh-dkurokawa in #1762
- Test guardrails with OTEL setup. by @sfc-gh-dkurokawa in #1764
- Make OTEL flow for users more natural by only having to set an environment variable and importing from non-experimental code. by @sfc-gh-dkurokawa in #1766
- Add in Snowflake OTEL e2e tests for tru_chain and tru_llama. by @sfc-gh-dkurokawa in #1768
- clean up old benchmark notebooks by @sfc-gh-jreini in #1760
- Migrate TruCustomApp to TruApp (backward compatible) by @sfc-gh-dhuang in #1770
- Set kwargs, return, and exception on all spans in the `ai.observability.call` span attribute scope. by @sfc-gh-dkurokawa in #1777
- Patch groundedness configs by @sfc-gh-jreini in #1778
- fix broken link on homepage by @sfc-gh-jreini in #1779
- Disable python <3.9 by @sfc-gh-chu in #1780
- Add in some minor test improvements. by @sfc-gh-dkurokawa in #1769
- make main_method required for app base class for OTel by @sfc-gh-dhuang in #1771
- Quieter Instrumentation by @sfc-gh-jreini in #1783
- Create OTEL notebook tests. by @sfc-gh-dkurokawa in #1782
- Run an E2E notebook on Snowflake notebooks (via staging the trulens packages). by @sfc-gh-dkurokawa in #1785
- Cost track `litellm.completion`. by @sfc-gh-dkurokawa in #1781
- Use new SPROC and UDTF, and verify more for OTEL Snowflake exporter. by @sfc-gh-dkurokawa in #1786
- Use `pytest` for grouping required/optional tests. by @sfc-gh-dkurokawa in #1788
- Remove `pytest.ini` as it's interfering with the `pyproject.toml`. by @sfc-gh-dkurokawa in #1790
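Several items above (#1748, #1755) concern instrumenting functions that are, or return, generators. One common way to do this, sketched here with a hypothetical `traced` decorator that is not the TruLens `@instrument` implementation, is to detect the function kind with `inspect` and keep the "span" open until the generator is exhausted instead of closing it when the generator object is first returned. The `events` list stands in for span start and end.

```python
import asyncio
import functools
import inspect


def traced(func):
    """Hypothetical instrumenting decorator; `events` stands in for span start/end."""
    events = []

    if inspect.isasyncgenfunction(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            events.append("start")
            try:
                async for item in func(*args, **kwargs):
                    yield item
            finally:
                events.append("end")  # closes after the last item
    elif inspect.isgeneratorfunction(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            events.append("start")
            try:
                yield from func(*args, **kwargs)
            finally:
                events.append("end")  # closes after the last item
    else:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            events.append("start")
            try:
                return func(*args, **kwargs)
            finally:
                events.append("end")

    wrapper.events = events
    return wrapper


@traced
def stream_tokens():
    yield "a"
    yield "b"


@traced
async def stream_tokens_async():
    yield "x"
    yield "y"


async def collect(agen):
    # Drain an async generator into a list.
    return [item async for item in agen]


sync_items = list(stream_tokens())
async_items = asyncio.run(collect(stream_tokens_async()))
print(sync_items, stream_tokens.events)
print(async_items, stream_tokens_async.events)
```

A plain function that merely returns a generator (the case in #1755) would take the last branch here; handling it would need an extra check on the returned value, which this sketch leaves out.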
Full Changelog: trulens-1.3.3...trulens-1.3.4
TruLens 1.3.3
What's Changed
- Enable exporting spans to snowflake stage if a `TruLensSnowflakeSpanExporter` is provided by @sfc-gh-gtokernliang in #1708
- Allow `TruChain`/`TruLlama`/`TruRails` to use OTEL spans. by @sfc-gh-dkurokawa in #1727
- move `poetry-core` restrictions to az pipeline by @sfc-gh-chu in #1732
- Updated context relevance chain-of-thought prompting by @sfc-gh-dhuang in #1744
- Add Llama-Index support for OTEL by @sfc-gh-dkurokawa in #1743
- Write args as kwargs to unknown type OTEL spans by @sfc-gh-chu in #1745
- Add async support for OTEL by @sfc-gh-dkurokawa in #1746
- Add multithreading test for OTEL by @sfc-gh-dkurokawa in #1747
Bug Fixes
- Fix type hint for fewshot prompt construction by @sfc-gh-dhuang in #1725
- Add protobuf-related utilities by @sfc-gh-gtokernliang in #1720
- fix typo in snowflake feedbacks notebook by @sfc-gh-jreini in #1731
- groundedness: update user prompt to be consistent with system prompt by @sfc-gh-jreini in #1730
- `run_dashboard` to print Local URL by @sfc-gh-jreini in #1740
Full Changelog: trulens-1.3.2...trulens-1.3.3
TruLens 1.3.2
Bug Fixes
- Handle pydantic upgrade that now handles `model_fields` as a `property` that can resolve to a `dict` when there's nothing. by @sfc-gh-dkurokawa in #1726
- Don't create event tables unless `TRULENS_OTEL_TRACING` env variable is set. by @sfc-gh-dkurokawa in #1724
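A minimal sketch of opting in through the `TRULENS_OTEL_TRACING` variable named above. The value `"1"` is an assumption; the release notes only say the variable must be set, so check the TruLens documentation for the accepted values.

```python
import os

# Assumption: "1" enables the OTEL tracing path. Set the variable before
# importing or initializing TruLens so it is visible at setup time.
os.environ["TRULENS_OTEL_TRACING"] = "1"

otel_enabled = os.environ.get("TRULENS_OTEL_TRACING") == "1"
print(otel_enabled)
```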
Full Changelog: trulens-1.3.1...trulens-1.3.2
TruLens 1.3.1
What's Changed
- Update credit consumption table for Cortex LLM by @sfc-gh-dhuang in #1721
Bug Fixes
- Fix Snowflake SQL alchemy breaking behavior and ensure `AUTOCOMMIT` is enabled to fix trulens ingestion by @sfc-gh-dhuang in #1719
- Fix and update Snowflake quickstart notebook by @sfc-gh-dhuang in #1722
Full Changelog: trulens-1.3.0...trulens-1.3.1
TruLens 1.3.0
Optimizing Feedback Functions
This release adds important changes that help you improve the alignment of your LLM-judge evals with human evaluations.
Global Improvement of Groundedness Feedback
The first is the global improvement of the groundedness feedback function (benchmarks and methods forthcoming). We invite any users to submit feedback (positive or negative) on the effectiveness of the new groundedness function using GitHub Issues or Discussions.
You can view the addition of new groundedness criteria in the GitHub diff below.
New levers for aligning feedback functions
The second change adds easy-to-use levers for changing the behavior of feedback functions: few-shot examples and custom criteria. Early customers have found these levers useful for aligning their feedback functions with their collected expert evaluations.
Adding custom criteria to a feedback function
```python
# `provider` is assumed to be an already-configured TruLens LLM provider instance.
custom_criteria = """
A positive sentiment should be expressed with an extremely encouraging and enthusiastic tone.
"""

provider.sentiment(
    "When you're ready to start your business, you'll be amazed at how much you can achieve!",
    criteria=custom_criteria,
)
```
Adding few-shot examples to guide feedback functions
```python
from trulens.feedback.v2 import feedback

# Each example pairs the inputs (query and response) with the score it should receive.
fewshot_relevance_examples_list = [
    (
        {
            "query": "What are the key considerations when starting a small business?",
            "response": "You should focus on building relationships with mentors and industry leaders. Networking can provide insights, open doors to opportunities, and help you avoid common pitfalls.",
        },
        3,
    ),
]

# `provider` is assumed to be an already-configured TruLens LLM provider instance.
provider.relevance(
    "What are the key considerations when starting a small business?",
    "Find a mentor who can guide you through the early stages and help you navigate common challenges.",
    examples=fewshot_relevance_examples_list,
)
```
What's Changed
- Feedback customization (including few-shot examples) by @sfc-gh-jreini in #1674
- Custom criteria for feedback by @sfc-gh-jreini in #1705
- Update groundedness criteria (with more optimized prompt) by @sfc-gh-dhuang in #1710
- Allow existing tables to be used in ground truth datasets by @sfc-gh-dhuang in #1698
Bug Fixes
- Allow passthrough of feedback parameters including temperature, groundedness configs in the `Feedback` class by @sfc-gh-jreini in #1674
- Remove / retire sql instrumentation in Cortex Endpoint by @sfc-gh-dhuang in #1715
- Poetry < 2.0.0 by @sfc-gh-jreini in #1709
- Update docs to use postgres + psycopg in order to avoid known issues with psycopg2 by @sfc-gh-gtokernliang in #1701
- Update prpr example notebook to reflect latest Cortex provider API by @sfc-gh-dhuang in #1712
Preparations for OpenTelemetry compatibility
- Introduce Event table for ORM to prepare for OTEL traces by @sfc-gh-gtokernliang in #1692
- Prototype OTEL exporter by @sfc-gh-gtokernliang in #1694
- Prototype @Instrument with OTEL by @sfc-gh-gtokernliang in #1693
- Move `main_input`, `main_output`, and `_extract_content` out of app.py by @sfc-gh-gtokernliang in #1706
- Move span-related validation + setting logic out of instrument.py by @sfc-gh-gtokernliang in #1707
Full Changelog: trulens-1.2.11...trulens-1.3.0