Show HN: Oodle – serverless, fully-managed, drop-in replacement for Prometheus

blog.oodle.ai

61 points by kirankgollu 5 hours ago

Hello HN!

My co-founder, Vijay, and I are excited to open up Oodle for everyone.

We used to be observability geeks at Rubrik. Our metrics bills grew about 20x over 4 years. We tried to control spend by getting better visibility and by blocking high-cardinality labels like pod_id, cluster_id, and customer_id. But that made debugging issues complicated. App engineers hated having metrics blocked, and blocking others' code reviews was no fun for platform engineers either! Migrations (and lock-ins) were very painful: the first migration, from Influx to SignalFx, took 6+ months, and the second, from Splunk, took over a year.

Oodle is taking a new approach to building a cost-efficient, serverless metrics observability platform that delivers fast performance at high scale. We use a custom storage format on S3, tuned for metrics data, and queries run as serverless functions. The hard part is achieving fast performance while optimizing for cost (every CPU cycle and every byte of storage and memory counts!). We've written about the architecture in more detail on our blog: https://blog.oodle.ai/building-a-high-performance-low-cost-m...
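
To make the "custom storage format tuned for metrics" idea a bit more concrete, here is a deliberately simplified sketch (not our actual on-disk format; the encodings are illustrative assumptions): timestamps and values for a series are stored as separate columns, with timestamps delta-encoded so regular scrape intervals compress extremely well.

    import struct
    import zlib

    # Toy columnar block for one time series (illustrative only, not Oodle's
    # real format): timestamps and values live in separate columns so each
    # compresses well on its own, and timestamps are delta-encoded.
    def encode_block(samples):
        timestamps = [ts for ts, _ in samples]
        values = [v for _, v in samples]
        deltas = [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]
        ts_col = struct.pack(f"<{len(deltas)}q", *deltas)   # int64 deltas
        val_col = struct.pack(f"<{len(values)}d", *values)  # float64 values
        return zlib.compress(ts_col), zlib.compress(val_col)

    def decode_block(ts_blob, val_blob):
        ts_raw = zlib.decompress(ts_blob)
        deltas = struct.unpack(f"<{len(ts_raw) // 8}q", ts_raw)
        timestamps, acc = [], 0
        for d in deltas:
            acc += d
            timestamps.append(acc)
        val_raw = zlib.decompress(val_blob)
        values = struct.unpack(f"<{len(val_raw) // 8}d", val_raw)
        return list(zip(timestamps, values))

    # 240 samples at a 15s scrape interval round-trip through the block.
    samples = [(1727000000 + 15 * i, 0.5 + i) for i in range(240)]
    ts_blob, val_blob = encode_block(samples)
    assert decode_block(ts_blob, val_blob) == samples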

Try out our playground with 13M+ active time series/hr & 13B+ samples/day: https://play.oodle.ai

Explore all features with live data via Quick Signup: https://us1.oodle.ai/signup
- Instant exploration (<5 min): run one command to stream synthetic metrics to your account
- Easy integration (<15 min): explore with your own data from an existing Prometheus or OTel setup

We’d love your feedback!

cheers

CubsFan1060 2 hours ago

The UI feels _very_ similar to Grafana. Even the dashboard folders look exactly the same to me. I would have thought that Grafana being AGPL would specifically forbid this?

Edit: Or maybe the AGPL just requires releasing any code you change? I could be confused.

  • kirankgollu 2 hours ago

    It’s indeed Grafana. We’ve been maintaining a public fork of Grafana.

    • CubsFan1060 2 hours ago

      Where do you keep the code?

      Found it: https://github.com/oodle-ai/grafana

      Why do this instead of just building a data source?

      Edit: Not to be that guy (but I'm about to be that guy). You have links to grafana.com (which is your competitor) all over the source of your page. It also lists the version as 11.1.0, which was released 6-21.

      All of the versions in your fork repo mention 11.0.0-pre. Did I find the wrong repo, or are you using code that you haven't published?

      The reason I mostly care is that this is the sort of thing that gets good open source projects closed down, and that makes me a bit sad.

      • mvijaykarthik an hour ago

        Oodle can be used purely as a data source, but we also wanted to provide a solution for customers who don’t have a visualization platform in place.

        Here is the branch we use: https://github.com/oodle-ai/grafana/tree/v11.1.0-oodle-stabl..., which has all the changes we have made in Grafana.

        • sosodev 23 minutes ago

          So the vast majority of your fork is just rebranding? Customers get to lose thousands of commits worth of improvements for that?

          • mvijaykarthik 17 minutes ago

            We are reasonably close to the latest version of Grafana. We periodically pull in new changes.

TripleChecker 2 hours ago

Cool! The website says “No Lock-In.” Does it mean that I can bring my own compute and storage?

Also, found a few typos and a broken link, see error report here: https://triplechecker.com/s/xEd4Hp/oodle.ai?v=uxGS1

  • kirankgollu an hour ago

    Thanks for the report - we just deployed a fix.

    No lock-in means it’s 100% open-source (PromQL) compatible. You can swap vendors or move to a self-hosted open source solution should you need to move away from Oodle. When you migrate out, you get to export all your data, dashboards, and alerts, and you don't need to make any code changes.
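
    To make "no code changes" concrete: anything that already speaks the standard Prometheus HTTP API keeps working, and switching backends is just a base-URL change. A minimal sketch (the URL below is a placeholder, not a real endpoint):

      import requests

      # Any Prometheus-compatible backend exposes the same HTTP query API,
      # so only the base URL changes when you swap vendors or self-host.
      # The URL below is a placeholder for the example.
      PROM_BASE_URL = "https://prometheus.example.internal"

      def instant_query(promql, base_url=PROM_BASE_URL):
          resp = requests.get(f"{base_url}/api/v1/query",
                              params={"query": promql}, timeout=10)
          resp.raise_for_status()
          return resp.json()["data"]["result"]

      # Same call and same PromQL, regardless of which backend serves it.
      for series in instant_query('sum(rate(http_requests_total[5m])) by (service)'):
          print(series["metric"], series["value"])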

    We support bringing your own bucket (BYOB) for large enterprise customers; however, you cannot bring your own compute at this time. Our thinking is along the lines of how Snowflake approached the problem - everything fully managed to keep the operational overhead minimal. https://jack-vanlightly.com/blog/2023/9/25/on-the-future-of-...

estebarb 2 hours ago

I'm wondering something: how is storage/compaction handled? AFAIK S3 lacks append semantics, so data must be accumulated somewhere else before storing it. Kinesis?

  • mvijaykarthik an hour ago

    We use a local disk to temporarily stage data before putting it on S3. We have smaller WAL (write-ahead log) objects and a periodic compaction process that creates read-optimized files on S3.
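
    Roughly, the flow looks like the sketch below (bucket and key names are made up, the local-disk staging and real WAL framing are omitted, and JSON stands in for our actual file format):

      import json
      import time
      import uuid

      import boto3

      s3 = boto3.client("s3")
      BUCKET = "example-metrics-bucket"  # placeholder bucket name

      def flush_wal(samples):
          """Write a small, write-optimized WAL object for recently staged samples."""
          key = f"wal/{int(time.time())}-{uuid.uuid4().hex}.json"
          s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(samples).encode())
          return key

      def compact(wal_keys):
          """Merge many small WAL objects into one larger, read-optimized object."""
          merged = []
          for key in wal_keys:
              body = s3.get_object(Bucket=BUCKET, Key=key)["Body"].read()
              merged.extend(json.loads(body))
          merged.sort(key=lambda s: s["timestamp"])
          out_key = f"blocks/{int(time.time())}.json"
          s3.put_object(Bucket=BUCKET, Key=out_key, Body=json.dumps(merged).encode())
          # Once the compacted block is durable, the WAL objects can be dropped.
          s3.delete_objects(Bucket=BUCKET,
                            Delete={"Objects": [{"Key": k} for k in wal_keys]})
          return out_key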

infecto 3 hours ago

The logo on your main page for oodle.ai is blurry.

Why use a .ai domain? I love LLMs, but this is a turnoff to me.

  • mvijaykarthik an hour ago

    We are still early in our journey, and are currently working on leveraging LLMs for incidents and query / dashboard generation.

    We do use pre-LLM-era AI and statistical analysis to provide insights and auto-create dashboards for alerts (currently in alpha).

nileshbansal 3 hours ago

We, at Workorb, migrated from Grafana to Oodle and are very happy so far. The observability space does need a ground-up reimagination, and we think Oodle is positioned to do that.

  • ijidak 3 hours ago

    I'm curious, why did you move off of Grafana?

    For the same reasons op listed or for other reasons?

suntracks 3 hours ago

Love the observability features here. Would love to see a detailed feature-set comparison across the competitor landscape.

  • kirankgollu an hour ago

    Thanks for the kind words - we will be posting a feature comparison matrix in the upcoming weeks on our website.

alanfranz 3 hours ago

Some comparison to Thanos would be great!

  • mvijaykarthik 2 hours ago

    Great question! Vijay here, I'm one of the co-founders of Oodle. Compared to Thanos:
    1. We use the object store (S3) for all queries, even recent time ranges. The object store is not just an archival tier.
    2. Customized indexing to minimize memory usage; the index also lives on object storage.
    3. A custom columnar file format optimized for storing metrics on object storage.
    4. Serverless functions for query performance. This lets us break queries down and parallelize them without paying for pre-provisioned compute (see the sketch below).
    5. No downsampling. With serverless compute and object storage, downsampling isn't required to improve query performance or reduce costs.
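
    To illustrate point 4, a very rough sketch of time-sharded query fan-out (the shard worker and merge step are placeholders, not Oodle internals):

      from concurrent.futures import ThreadPoolExecutor

      def query_shard(promql, start, end):
          # Placeholder for "invoke a serverless worker that evaluates this
          # shard against columnar blocks on object storage".
          return [(start, end, promql)]

      def fan_out_query(promql, start, end, shard_seconds=3600):
          # Split the query's time range into fixed-size shards.
          shards, t = [], start
          while t < end:
              shards.append((t, min(t + shard_seconds, end)))
              t += shard_seconds
          # Evaluate shards in parallel, then merge the partial results.
          with ThreadPoolExecutor(max_workers=16) as pool:
              partials = list(pool.map(lambda s: query_shard(promql, *s), shards))
          merged = []
          for part in partials:
              merged.extend(part)
          return merged

      # A 24h range query becomes 24 one-hour shards evaluated concurrently.
      results = fan_out_query('rate(http_requests_total[5m])', 0, 24 * 3600)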

  • verdverm 2 hours ago

    Yup, fan of the LGTM stack + Alloy

navjack27 4 hours ago

I don't know how trademark works or anything like that, not a lawyer, etc., but lots of stuff is called Oodle. I wish you luck.

  • kirankgollu 4 hours ago

    Thanks for the heads up. We did check on IP/trademarks just to be sure we avoid violations.

    • bayarearefugee 2 hours ago

      Oodle is a registered trademark:

      https://uspto.report/TM/88478792

      RAD is now owned by Epic Games (acquired in 2021) so they have very deep pockets.

      A lot of people, including myself, clearly assumed at first that there must be some association, given you are using this name in a not-entirely-unrelated field.

      IANAL but I hope you're real sure that you are legally in the clear before you commit too deeply to the name.

thinkmassive 3 hours ago

Is it SaaS-only?

  • bradfitz 3 hours ago

    If it weren't, then you'd need servers and it couldn't be serverless! :)

    • mike_d 3 hours ago

      "Serverless" is an overloaded term marketing that really means functions-as-a-service. Looking at the stack, I don't actually see any components that you couldn't easily port to an on-prem solution.

  • kirankgollu 3 hours ago

    Yes, it's fully managed only at this time. However, Oodle is very cost-efficient; it's cheaper than your self-hosted infra costs. https://oodle.ai/usecases/self-hosted

    • sosodev 19 minutes ago

      I would love to see an actual breakdown of oodle vs self hosted costs. I seriously doubt that it’s cheaper.

    • someguy101010 3 hours ago

      Any plans to open source? I feel very comfortable using neon.tech (separates compute from storage for Postgres) because they open-source their stack, but it would be hard for me to adopt something like Oodle without an open source version.

manishsharan 3 hours ago

I have been meaning to ask the observability experts this question:

Why not dump all metrics, events, and logs into ClickHouse, and purge data as necessary? For small to medium-sized businesses/solution ecosystems, will this be enough?

  • iampims 3 hours ago

    It'll work. ClickHouse even has experimental support for storing Prometheus metrics natively. A big missing piece is alerting.

    • kirankgollu an hour ago

      ClickHouse is great for logs and traces; for metrics, however, it is still in the early phase. ClickHouse is also a general-purpose, real-time analytics database (see clickhouse.com), whereas Oodle is built specifically for end-to-end metrics observability.

devmor 2 hours ago

Why did you give your startup the same name as the most popular network compression library for video games? This seems short-sighted. Even if you don't run afoul of trademark/copyright, you're sharing a lot of SEO and marketing terminology.

  • kirankgollu 2 hours ago

    Point taken. Thanks for the feedback. Our reasoning is that we’d like the name to be short and memorable. A bunch of observability companies have the "observe" keyword overloaded all over the place, and we wanted our name to stand out. Oodle = Optimized Observability Data Lake.

sidd22 2 hours ago

[flagged]

  • kirankgollu an hour ago

    Our P99 query latency is under 3 seconds. We have tested up to 100M unique time series/hour, and the architecture can scale to a billion time series/hour. To get a feel for the performance at high scale, give us a try at https://play.oodle.ai

shivarevanth 2 hours ago

[flagged]

  • kirankgollu an hour ago

    With our custom columnar format and indexes, we are able to filter down to the relevant data files where a high-cardinality label value is present. This keeps queries fast for high-cardinality labels as well, allowing us to quickly drill down on specific pod_id/cluster_id/customer_id style labels.
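
    A simplified way to picture that pruning is an inverted index from label values to the data files that contain them (the structure below is illustrative, not our actual index):

      from collections import defaultdict

      # Toy inverted index: (label, value) -> set of data files containing it.
      # A query filtering on customer_id="acme" only reads the files listed
      # for that value instead of scanning every block.
      index = defaultdict(set)

      def add_file(file_name, labels):
          for key, value in labels.items():
              index[(key, value)].add(file_name)

      def files_for(selector):
          """Intersect the file sets for each label matcher in the selector."""
          sets = [index[(k, v)] for k, v in selector.items()]
          return set.intersection(*sets) if sets else set()

      add_file("block-001.col", {"cluster_id": "c1", "customer_id": "acme"})
      add_file("block-002.col", {"cluster_id": "c2", "customer_id": "acme"})
      add_file("block-003.col", {"cluster_id": "c1", "customer_id": "globex"})

      print(files_for({"customer_id": "acme"}))                      # two files
      print(files_for({"customer_id": "acme", "cluster_id": "c1"}))  # one file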

weego 3 hours ago

Why is the primary sales call to action that it's serverless if it's a hosted solution? Who cares.

  • memset 2 hours ago

    Because at the early stages it’s really important to talk to customers.

    This also helps find users for whom this is a huge pain point - metrics costs are so high that they’d love to talk to someone and complain about the problem.

    • beaker52 2 hours ago

      “fully-managed, cheap metrics, ideal for serverless applications”

protocolture 3 hours ago

"fully managed, serverless"

So it's not really a drop-in replacement for Prometheus then; it's more of a "send all your data to some other bloke" kind of replacement.

Software as a service is fine, but you don't need to hide it behind hip marketing terminology.

  • mvijaykarthik an hour ago

    Technically you are correct: the scraper will still exist. However, the hard part is scaling the query and storage layers, which is what we replace.