Oversaid – Keeping track of who said what in tech (oversaid.com)
64 points by atestu on Nov 19, 2019 | 23 comments



Hi all,

Over the past few months I have been building a data pipeline that extracts quotes and assigns them to company executives, using natural language processing (spaCy).

I am focusing on the tech industry right now, which I know pretty well (I used to work at CB Insights). Every Wednesday I put together a newsletter featuring interesting quotes and explain why they matter.

Eventually, Oversaid will be a platform that lets you search quotes yourself and come up with your own analysis, but I wanted to launch something early while I build out the product so that I can learn from prospective customers.

I'm having a lot of fun writing these up and I hope you enjoy reading them!

Here's the last email I sent: https://mailchi.mp/ab609ce94c60/bill-gates-gives-bezos-some-...


I think it would be useful to have the latest email (or any email that you sent in the past) available somewhere on the website as well.

If it wasn't for your comment here, I wouldn't be able to see an example.

On a more personal note about the email, I'd drop the first paragraph, or make it shorter and focus on what you promised to send.

One great daily email that I've subscribed to and really enjoy reading is from the Daily Stoic [0]. The emails are quite short and to the point every time.

[0] - https://dailystoic.com


Thanks, I'll check out that email!

The intro will be shorter from now on; that last one was still intended for a small audience who were helping me iterate on the newsletter.


Hey, this looks kind of nice actually. But why hide the content behind an email sign-up? Why not also provide an RSS feed, or at least the latest 1-3 send-outs? If you hadn't provided that link to the latest e-mail I would've brushed it off entirely and not given it a moment's more thought. But you did provide that link, and I found the content relevant and interesting, and even dove deeper into a couple of the articles.

The content in this one e-mail is pretty good, but before I sign up I'd like to see more. But also it's not clear to me that I'm your customer, because reading your post here it sounds like your customer isn't the consumer of these feeds but actually the producers? I'm not one of them, so why do you care about my e-mail address, other than to sell to your actual customers down the line?


Hi, fair question!

The main reason I'm starting this newsletter is to start conversations with people about this data, which is why I want people's emails.

Right now I'm hoping people are enticed enough to enter their email address to check it out. I might be wrong about that, and I might add a link to the latest email on the home page in the future to see if signups improve.

I have no intention of selling those emails. My goal is to start a SaaS platform based on this data. A very small number of subscribers will hopefully become clients. The rest can just enjoy the newsletter and help me get the word out if they feel like it.


From what the OP says, it seems that this would eventually be a paid service (with a free newsletter) for people (maybe journalists) to easily find quotes/tidbits so that they can... well, do whatever they'd like with that information.

I don't think the OP is gonna sell your email.


Thank you for going after primary sources.

The sample letter refers to the press. Executives tend to be general with the press. Investors make them talk specifics. That happens during quarterly earnings calls, which would be a good source. Seeking Alpha has the transcripts.

The SEC filings are another reliable source. Though not personal quotes, they include perspectives on tech trends.


Yep - 2 things about this

1) I am using the press as a kind of filter of what's interesting (hypothesis: they pick the most interesting things executives are saying on earnings calls)

2) I am working on extracting from other sources like earnings transcripts and Twitter, but this will be tougher for me to go through due to the sheer amount of things that are said in those sources

SEC filings are a good one too; the management analysis section of filings is always interesting, so I could extract that.

Thanks for the feedback!


I must say I found the "irrelevant commentary" rather entertaining, and I think it really enhances the value of the content. It's just the right amount to be fun, yet not overwhelming. Keep at it!

boomshakalaka!


Interesting project! Do you have some info to share about the project architecture and technologies used? :3


Thanks! Yeah, the backend is all Python (using spacy.io for NLP), and I built an admin in Rails to search through the quotes and correct them when needed. Everything is stored in Postgres.

I initially thought I would be able to manually review each quote (which was the main reason I built the admin), but it quickly became obvious that there were just too many quotes, so I had to improve quote detection and attribution.

Big thanks to whoever wrote the spaCy docs. I knew nothing about NLP (and I still know next to nothing tbh), but I was able to start using it pretty quickly.
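
To give a rough idea of what "quote detection and attribution" means, here is a minimal sketch. It is not the actual pipeline: it swaps spaCy for plain regular expressions from the Python standard library, and the reporting verbs, the "two capitalized words" name rule, and the sample names (Jane Doe, John Smith, Example Corp) are all made-up assumptions for illustration.

```python
import re
from typing import List, NamedTuple, Optional


class Quote(NamedTuple):
    text: str
    speaker: Optional[str]  # None when no attribution was found


# Assumption: a tiny, hypothetical list of reporting verbs.
VERBS = r"(?:said|says|told|added|wrote|noted)"
# Assumption: a "name" is two or more capitalized words in a row.
NAME = r"([A-Z][a-z]+(?: [A-Z][a-z]+)+)"

QUOTE_RE = re.compile(r'"([^"]+)"')
# Attribution right before the quote, e.g. 'John Smith added: "..."'
BEFORE_RE = re.compile(rf"{NAME}\s+{VERBS}[,:]?\s*$")
# Attribution right after the quote, e.g. '"...," said Jane Doe'
AFTER_RE = re.compile(rf"^,?\s*(?:{VERBS}\s+{NAME}|{NAME}\s+{VERBS})")


def extract_quotes(text: str) -> List[Quote]:
    """Find double-quoted spans and guess a speaker from nearby text."""
    quotes = []
    for m in QUOTE_RE.finditer(text):
        speaker = None
        hit = BEFORE_RE.search(text[: m.start()])
        if hit:
            speaker = hit.group(1)
        else:
            hit = AFTER_RE.match(text[m.end():])
            if hit:
                # Only one of the two NAME groups can be set.
                speaker = hit.group(1) or hit.group(2)
        # Drop the trailing comma/period that sits inside the quote marks.
        quotes.append(Quote(m.group(1).rstrip(",. "), speaker))
    return quotes


if __name__ == "__main__":
    sample = (
        '"We are doubling down on cloud," said Jane Doe, chief executive '
        "of Example Corp. In a later interview, John Smith added: "
        '"Growth was 40% last year." The memo simply read "ship it" in bold.'
    )
    for quote in extract_quotes(sample):
        print(quote)
```

A heuristic like this breaks quickly on real articles (pronouns, titles, nested quotes, curly quotation marks), which is presumably where named-entity recognition and a manual correction step come in.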


Sorry in advance for the rather harsh feedback that follows. It’s basically my unfiltered impression when I visited your site before reading your introduction comment here.

This is one of those occasions when I love the German “every website needs a ‘Who’s responsible for this?’ page” law (a.k.a. “Impressumspflicht”). Why should I trust your specific selection of quotes and their interpretation? You don’t even trust your visitors with a “Who are we?” section. For all I know, this could be a Chinese or Russian troll factory outlet sale.

For this to work (for me, at least) you need to work _way_ more on the site’s transparency than a more or less default privacy disclaimer and an e-mail input form: Who am I, what criteria and sources are used for the quotes, how are they categorized, what do I do to prevent bias... The technology may as well be sound and state-of-the-art, but if I don’t trust the website, I won’t sign up to anything.


Hi -- thanks for the candid feedback, that's why I'm posting on HN :)

I think that's a good point, I will add an "about" page. I was thinking of adding a "how it works" page as well, I'll work on that.



I'm not sure it's a transparency issue so much as one of editorial perspective. I don't really know or care who gave me my CNN/Fox/BBC/ITV news, but I care that it has a consistent and trustworthy editorial perspective, which may be carried across multiple people.

I guess transparency and a greater overarching perspective both work.


Does the commentator matter if the quotes are sourced?


Yes, because it may show the commentator's intention and may offer some context for the interpretation.

Also, a lot of people don't read the source. Or know the context.


Love this idea. And congrats Alex - CB Insights Mafia!

Things that I think folks (aka me) would find interesting:

1. Impressions/views on tech markets (what they're entering, their views on growth of markets, etc)

2. Views on competition esp if they talk isht

3. Quotes with data. Because some of these products by tech cos are opaque with stats, if execs drop a figure about growth, that's valuable.

4. Over time, it'd be cool if you could see what products execs talk about in their public comments as that might give an indication of what they're focused on. Suspect they might not talk a lot about specific products so perhaps wishful thinking.

IMO, personally, I think quotes on politics are boring mainly cuz that doesn't give insight into the biz and cuz politics is already everywhere.

Enjoying the emails so far and look forward to seeing where Oversaid goes

Congrats again.


Thanks Anand!

Love idea #3, I'll have to dig in and see if there's enough of that. Might be info that's in the article and not in the quote itself.

I'm excited to get into #4 as I go through historical sources and extract old quotes.


For #3, you may want to include guest engineering blog posts shared to highscalability [1]. There are sometimes growth hints and stats included in the posts.

[1] - http://highscalability.com/blog/category/example


Perfect! Thank you


Rich people thoughts delivered directly to your email, oh boy sign me up.


I want to have some fun. There goes my email...



