RavenPack and Tiingo
In this assignment, we’ll learn the basics of analyzing/thinking about alternative data.
What to submit
1.A writeup and code. The writeup may be an .ipynb file with embedded output and commentary.
2.You must use code. Python, R, and Julia are acceptable. No VBA, Excel, etc.
3.You need data in the common_goods database, which I’ve stripped down to simpler versions of RavenPack and Tiingo. Note that I will not put the data in MySQL. It should be available in StarRocks or ClickHouse, and probably Parquet.
Questions
1.Describe the Tiingo and RavenPack datasets along the lines of:
When they began in business
Tiingo Coverage universes
Fields / metrics
Are there any advantages to Tiingo?
Optional: licenses – how do they license their data? To make it easier, I crossed this out but still worthwhile thinking about
2.What is the lag between when news breaks and when it becomes observable by a fund through RavenPack? (For RP you can just look up the paper https://knowledge.wharton.upenn.edu/article/high-frequency-trading-profiting-news/)
What about Tiingo? For RavenPack, you just have to read. For Tiingo, you can calculate the lag from the crawl date versus the publish date in common_goods.tiingo_news.
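A minimal sketch of the Tiingo lag calculation, assuming the table exposes crawlDate and publishedDate as ISO 8601 timestamp strings (the rows below are made up, and the real column formats may differ):

```python
import pandas as pd

def crawl_lag_seconds(news: pd.DataFrame) -> pd.Series:
    """Seconds between publication and crawl for each article."""
    published = pd.to_datetime(news["publishedDate"], utc=True)
    crawled = pd.to_datetime(news["crawlDate"], utc=True)
    return (crawled - published).dt.total_seconds()

# Hypothetical rows standing in for common_goods.tiingo_news.
sample = pd.DataFrame({
    "publishedDate": ["2021-12-01T14:00:00Z", "2021-12-01T15:30:00Z", "2021-12-02T09:00:00Z"],
    "crawlDate": ["2021-12-01T14:00:45Z", "2021-12-01T15:40:00Z", "2021-12-02T09:02:00Z"],
})
lag = crawl_lag_seconds(sample)
# The distribution (median, tail) is usually more informative than the mean.
print(lag.describe(percentiles=[0.5, 0.9]))
```

On real data it is worth checking for negative lags, which would suggest time zone or timestamp quality issues rather than a true lag.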
3.Given the paper we discussed in class, with Tiingo can you implement the “News Momentum” paper?
4.How many publishers/websites does RavenPack cover? What about Tiingo? For RavenPack I am referring to the SOURCE_NAME, for Tiingo I am referring to the URL top level domain.
Calculate the top 15 publishers by article count within the year 2021, and point out one publisher in Tiingo’s top 15 versus RavenPack’s.
common_goods.rp_2021 (year 2021 data)
tingle.news (use the publishdate)
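One way to count Tiingo publishers is sketched below. It assumes "top level domain" means the site's host name with any leading "www." removed, which is an interpretation rather than something the assignment spells out, and the URLs are placeholders:

```python
from collections import Counter
from urllib.parse import urlparse

def site_domain(url: str) -> str:
    """Lower-cased host of a URL with any leading 'www.' removed."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host

def top_publishers(urls, k=15):
    """Return the k most common domains as (domain, article_count) pairs."""
    return Counter(site_domain(u) for u in urls).most_common(k)

# Placeholder URLs; in practice these come from the Tiingo news table.
urls = [
    "https://www.example.com/a",
    "https://example.com/b",
    "http://news.example.org/c",
]
top = top_publishers(urls)
print(top)
```

For RavenPack the same Counter can be applied directly to the SOURCE_NAME values, since no URL parsing is needed there.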
5.Link RavenPack and Tiingo to PERMNO. Then, for December 2021, rank stocks by article count, dating articles by publishedDate (Tiingo) or TIMESTAMP_UTC (RavenPack). Which stocks (common shares, e.g. SHRCD 10, 11) are the newsiest (i.e. have the highest count of news) in both datasets?
If you did a rank correlation, imputing 0s for missing, what is the correlation?
How many stocks are in the same top 10% of article count?
common_goods.count_tiingo_publishdate [
CREATE TABLE class_2024.count_tiingo_crawl ENGINE = MergeTree ORDER BY (permno, month)
SETTINGS allow_nullable_key = 1 AS
SELECT permno, toStartOfMonth(date(left(crawlDate, 10)) + 7) AS month,
    toInt32(count()) AS n, comnam
FROM tingle.news
INNER JOIN crsp_202401.dsenames ON lower(news.ticker) = lower(dsenames.ticker)
WHERE (left(crawlDate, 10) >= namedt) AND (left(crawlDate, 10) <= nameendt)
GROUP BY ALL
]
common_goods.count_rp_monthly [
CREATE TABLE class_2024.count_rp ENGINE = MergeTree ORDER BY (permno, month)
SETTINGS allow_nullable_key = 1 AS
SELECT permno, toInt32(count()) AS n, comnam,
    toStartOfMonth(date(left(TIMESTAMP_UTC, 10)) + 7) AS month
FROM rpnew.full
INNER JOIN (
    -- AND binds tighter than OR: as written, the DATA_TYPE filter applies only to the RANGE_END >= '2020-12-31' branch
    SELECT RP_ENTITY_ID, DATA_VALUE AS cusip
    FROM rpnew.entity_mapping_full_edition
    WHERE RANGE_END = '' OR RANGE_END >= '2020-12-31' AND DATA_TYPE = 'CUSIP'
) AS mapping ON mapping.RP_ENTITY_ID = full.RP_ENTITY_ID
INNER JOIN crsp_202401.dsenames
    ON left(lower(mapping.cusip), 8) = left(lower(dsenames.cusip), 8)
WHERE left(TIMESTAMP_UTC, 10) BETWEEN namedt AND nameendt
GROUP BY ALL
]
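The rank correlation and top-10% overlap in question 5 could be computed along these lines. This is a sketch under stated assumptions: the universe is the union of PERMNOs appearing in either dataset, ties get average ranks, and the counts below are invented for illustration:

```python
import pandas as pd

def compare_counts(tiingo: pd.Series, rp: pd.Series, top_frac: float = 0.10):
    """Spearman correlation and top-fraction overlap of two count series indexed by permno."""
    # Align on the union of PERMNOs, imputing 0 where a stock is missing from one dataset.
    both = pd.concat({"tiingo": tiingo, "rp": rp}, axis=1).fillna(0)
    # Spearman correlation is the Pearson correlation of the (average-tie) ranks.
    ranks = both.rank()
    rho = ranks["tiingo"].corr(ranks["rp"])
    # Top fraction by article count in each dataset, then intersect.
    k = max(1, int(len(both) * top_frac))
    top_t = set(both["tiingo"].nlargest(k).index)
    top_r = set(both["rp"].nlargest(k).index)
    return rho, sorted(top_t & top_r)

# Invented December 2021 article counts keyed by hypothetical PERMNOs.
tiingo_n = pd.Series({10001: 50, 10002: 40, 10003: 30, 10004: 20, 10005: 10})
rp_n = pd.Series({10001: 500, 10002: 300, 10003: 400, 10006: 100})
rho, overlap = compare_counts(tiingo_n, rp_n)
print(round(rho, 3), overlap)
```

Imputing zeros creates many tied ranks when coverage differs a lot between the two vendors, so the choice of universe can move the correlation noticeably.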
6.Suppose you run a hedge fund and are considering basing the entire fund on news-based strategies, and you would make 3% extra annualized from any strategy using RavenPack news. Suppose RavenPack costs $100,000 per year. How much extra AUM would you need to make it worthwhile, assuming no other considerations, under a typical 20% profit mandate?
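One possible reading of the break-even arithmetic is sketched here. It assumes the "20% profit mandate" means the manager keeps 20% of the extra profit as a performance fee, with no management fee and no other costs. Other readings would change the formula.

```python
def breakeven_aum(data_cost: float, extra_return: float, profit_share: float) -> float:
    """AUM at which the manager's share of the extra return equals the data cost."""
    # Solve profit_share * extra_return * AUM = data_cost for AUM.
    return data_cost / (extra_return * profit_share)

aum = breakeven_aum(data_cost=100_000, extra_return=0.03, profit_share=0.20)
print(f"Break-even AUM: ${aum:,.0f}")
```

Under these assumptions the break-even is roughly $16.7 million of AUM.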
7.Okay, now suppose you run a multistrat. How does your answer depend on correlations between news and other data sources? What happens if you have data sources such as Google searches or Bloomberg searches for a ticker, which similarly capture investor attention? Does this increase or decrease the willingness to pay? In the alpha/beta framework, how might that show up?
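To make the alpha/beta framing concrete, one could regress the returns of a news-based strategy on the returns of a strategy built from an existing attention signal. The sketch below uses synthetic returns with made-up parameters (a loading of 0.8 and a small intercept), so it illustrates the mechanics only:

```python
import numpy as np

def alpha_beta(new_returns, existing_returns):
    """OLS intercept (alpha) and slope (beta) of new_returns on existing_returns."""
    x = np.asarray(existing_returns, dtype=float)
    y = np.asarray(new_returns, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    (alpha, beta), *_ = np.linalg.lstsq(design, y, rcond=None)
    return alpha, beta

# Synthetic daily returns (made up): the news strategy loads 0.8 on the existing signal.
rng = np.random.default_rng(0)
existing = rng.normal(0.0, 0.01, size=5000)
news = 0.0002 + 0.8 * existing + rng.normal(0.0, 0.01, size=5000)
alpha, beta = alpha_beta(news, existing)
print(f"alpha={alpha:.5f}, beta={beta:.3f}")
```

A high beta with a small alpha would indicate that most of the news strategy's return is already available from the existing data source.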
8.RavenPack’s NLP methods are “old school” … given its HFT clientele, should they adopt GPT4? Make a good argument for or against. What types of advantages might you get from using GPT4?
9.What types of disadvantages or concerns might there be with back-testing with GPT4? Hint: Consider its training set.
10.RavenPack has a new product called RavenPack Edge, launched in September 2021. What’s wrong with going back in time and back-testing with RavenPack Edge to 2005?
https://conference.nber.org/confer/2013/MMf13/von_Beschwitz_Keim_Massa.pdf
11.Extra credit for portfolio sorting (4 points): Using the data provided, look at news attention (log articles) at t-1. First sort on momentum. Then sort on news (2 bins). Before doing any sorting, filter to prc >= 5 and mcap bin >= 3. How does an attention portfolio with high RavenPack attention look versus low attention? How about Tiingo? Why does high or low attention predict stronger momentum?
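The sequential sort described in question 11 might be sketched as follows. Apart from prc, the column names (month, mom, log_articles_lag, mcap_bin, ret_next) are hypothetical, and the choice of five momentum bins is an assumption since the assignment does not fix the number:

```python
import pandas as pd

def double_sort(df: pd.DataFrame, n_mom: int = 5) -> pd.DataFrame:
    """Mean next-period return by momentum bin (rows) and news-attention bin (columns)."""
    # Filters from the assignment, applied before any sorting.
    df = df[(df["prc"] >= 5) & (df["mcap_bin"] >= 3)].copy()
    # First sort: momentum bins within each month.
    df["mom_bin"] = df.groupby("month")["mom"].transform(
        lambda s: pd.qcut(s, n_mom, labels=False, duplicates="drop")
    )
    # Second sort: low/high attention within each month and momentum bin.
    # Ranking first breaks ties so the two bins are equally sized.
    df["news_bin"] = df.groupby(["month", "mom_bin"])["log_articles_lag"].transform(
        lambda s: pd.qcut(s.rank(method="first"), 2, labels=False)
    )
    return df.pivot_table(index="mom_bin", columns="news_bin", values="ret_next", aggfunc="mean")

# Made-up single-month panel of 20 stocks that all pass the filters.
n = 20
sample = pd.DataFrame({
    "month": ["2021-12"] * n,
    "mom": [i / 100 for i in range(n)],
    "log_articles_lag": [float(i % 4) for i in range(n)],
    "prc": [10.0] * n,
    "mcap_bin": [3] * n,
    "ret_next": [0.01 * (i % 4) for i in range(n)],
})
table = double_sort(sample)
print(table)
```

The momentum spread (top minus bottom momentum bin) can then be compared across the two news columns. Each month and momentum bin needs at least two stocks for the second sort to work.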
12.Extra credit (2 points): What are some metrics you could create that are not currently present in the RavenPack dataset, or which are subpar and can be improved?
13.Extra credit for portfolio sorting (4 points): Come up with another strategy, based on some attention or behavioral finance theory (or frankly, any other), and show me how news counts from RavenPack can be used to improve it. Run said strategy. You can look at the database opensource2023, which has about 200+ factors implemented already.