Can we start in the builder and then extend with custom tools via APIs? Do you offer evals/analytics dashboards to trace failures and measure conversation quality? | discoverkit