mmd/__pycache__/ground_news.cpython-314.pyc

184 lines
28 KiB

2026-05-26 22:21:27 +02:00
<00><>j<>R<00><00>ja<00>R<>t90tRt^RIt^RIt^RIt^RIt^RIt^RIt^RI H
t
^RI H t H t ]
!]4PR, tRtRR<>RR<>/tRR R
R R R RR/t/RRbRRbRRbRRbRRbRRbRRbRRbR R!bR"R#bR$R%bR&R'bR(R)bR*R+bR,R-bR.R/bR0R1b/R2R3bR4R5bR6R7bR8R9bR:R;bR<R=bR>R?bR@RAbRBRCbRDREbRFRGbRHRIbRJRKbRLRMbRNRObRPRQbRRRSbCRTRURVRWRXRYRZR[R\R]R^R_R`RaRbRcRdReRfRgRhRi/ Ct]^kRjRkltR<>RlRmllt]P.!Rn4t]P.!Ro4t]P.!Rp4t]P.!Rq4tRrRsltRtRultRvRwltRxRyltR<>RzR{llt R<>R|R}llt!R~R/R<>R<>llt"R~R/R<>R<>llt#R<>R~RR<7F>^ /R<>R<>lllt$R<>R<EFBFBD>R<>llt%R<>R<EFBFBD>R<>llt&]'R<>8XEd<>^RI(t(]!4t ])!](PT4^8<>d{](PT^,R8Xdc](PT^,t+] R<>]+ 2t,]!] ],R4wt-t.]!]-]+4t/]0!R<>].'dR<>MR<> R<>24]0!]Pb!]/^RR<7F>74M<>])!](PT4^8<>d<>](PT^,R<>8Xdw](PT^,t+]#!]+] 4wt2t.]0!R<>].'dR<>MR<> R<>])!]24 R<>24]3!]2R<32>R<>R<EFBFBD>7F't4]0!R<>]4R<34>,R<> R<>]4R<34>,R<>, 24K) MHR<48>](PT9t5^t6]0!R<>])!]4 R<>]5 R<>24]$!] ]5R<35>7]%!] ^]6R<36>7t7]&!]7R<37>]6 R<>24] Pq4R#R#)<29>u<
ground_news.py — Ground News article fetcher + local SQLite store
Key design:
- RSC payload trick: send RSC: 1 header to get Next.js App Router data
- page_cache table: raw RSC payloads with TTL (don't re-fetch fresh pages)
- articles table: all extracted fields, categories merged across pages
- fetch_article(slug) — single article, rich data
- fetch_category(slug) — all stories on an interest page (~15 stories)
- fetch_all() — all known interest categories in parallel
- top_articles(n, days)— query DB for top-N by source_count
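
The page_cache idea listed above (raw payloads stored with a TTL so that fresh pages are not re-fetched) can be sketched roughly as follows. This is a minimal illustration and not the module's actual code: the TTL values, the `open_cache` helper, the injected `fetch` callable, and the `now` parameter are all assumptions made for the example. Only the table name, the `url` / `page_type` / `content` / `fetched_at` columns, and the `(content, from_cache)` return shape are taken from the readable strings in this file.

```python
import sqlite3
import time
from typing import Callable, Optional, Tuple

# Assumed TTLs in seconds per page type; the real CACHE_TTL values
# cannot be read from the compiled file, so these are placeholders.
CACHE_TTL = {"interest": 1800, "article": 86400}
DEFAULT_TTL = 3600  # hypothetical fallback for unknown page types


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open a SQLite database and make sure the page_cache table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS page_cache ("
        "url TEXT PRIMARY KEY, page_type TEXT, content TEXT, fetched_at INTEGER)"
    )
    return db


def fetch_cached(
    db: sqlite3.Connection,
    url: str,
    page_type: str,
    fetch: Callable[[str], str],
    now: Optional[int] = None,
) -> Tuple[str, bool]:
    """Return (content, from_cache); call fetch(url) only when the row is stale."""
    now = int(time.time()) if now is None else now
    row = db.execute(
        "SELECT content, fetched_at FROM page_cache WHERE url=?", (url,)
    ).fetchone()
    ttl = CACHE_TTL.get(page_type, DEFAULT_TTL)
    if row is not None and now - row[1] < ttl:
        return row[0], True  # still fresh: serve the stored payload

    content = fetch(url)  # stale or missing: fetch again
    db.execute(
        "INSERT OR REPLACE INTO page_cache (url, page_type, content, fetched_at) "
        "VALUES (?,?,?,?)",
        (url, page_type, content, now),
    )
    db.commit()
    return content, False
```

Passing the fetcher in as a callable keeps the sketch free of network access; in practice it would wrap whatever HTTP client issues the request with the `RSC: 1` header.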
N)<01>Path)<02>get_conn<6E>DBConnzground_news.dbzhttps://ground.news<77>interest<73>articlez
User-Agentz2Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36<EFBFBD>Acceptztext/html,application/xhtml+xml<6D>RSC<53>1zNext-Router-State-TreezX%5B%22%22%2C%7B%22children%22%3A%5B%22__PAGE__%22%2C%7B%7D%5D%7D%2Cnull%2Cnull%2Ctrue%5D<35>europe<70>Europezeurope-economyzEurope Economyzeuropean-politicszEuropean Politicszeuropean-unionzEuropean Unionzeuropean-security-and-natozEuropean Security & NATOz uk-politicsz UK Politicszunited-kingdomzUnited Kingdom<6F> international<61> Internationalz north-americaz North Americaz south-americaz South America<63>africa<63>Africa<63>asia<69>Asia<69> australia<69> Australiaz us-politicsz US Politicsz united-statesz United Statesz donald-trumpz Donald Trumpztrump-administrationzTrump Administrationzisraeli-palestinian-conflictzIsraeli-Palestinian Conflictzbusiness-and-marketszBusiness & Marketszpremier-leaguezPremier League<75>soccer<65>Soccerz memorial-dayz Memorial Day<61>pharma<6D>Pharmaceuticals<6C>energy<67>Energyzrenewable-energyzRenewable Energy<67>denmark<72>Denmark<72>finance<63>Finance<63> corporate<74> Corporate<74>
technology<EFBFBD>
Technologyzclimate-changezClimate Change<67>shipping<6E>Shipping<6E>biotech<63>Biotech<63>
healthcare<EFBFBD>
Healthcare<EFBFBD>pharmaceutical<61>Pharmaceutical<61>nordic<69>Nordic<69> scandinavia<69> Scandinaviazdenmark-economyzDenmark Economyzdanish-economyzDanish Economyzglobal-economyzGlobal Economyzglobal-marketszGlobal Marketsz stock-marketz Stock Market<65> investing<6E> Investingz clean-energyz Clean Energy<67> logistics<63> Logistics<63>diabetes<65>Diabetesc<00>$<00>V^8<>dQhR\/#)<02><00>return)r)<01>formats"<22>./home/hjess/Projects/MoneyMaker/ground_news.py<70> __annotate__r9cs<00><00><16><16><06><16>c<04><00>\4#)zIReturn a DBConn wrapper (Postgres or SQLite). Schema is managed by db.py.)r<00>r:r8<00>get_dbr=cs
<00><00> <13>:<3A>r:c
<00>j<00>V^8<>dQhR\R\R\R\\\3,/#)r5<00>db<64>url<72> page_typer6)r<00>str<74>tuple<6C>bool)r7s"r8r9r9ls2<00><00><19><19>V<EFBFBD><19>#<23><19>#<23><19>u<EFBFBD>S<EFBFBD>RV<52>Y<EFBFBD>GW<47>r:c <04><><00>VPRV34P4p\PVR4p\ \
P
!44pV'd WSR,,
V8d VR,R3#\ P!V\R^R7pVP4VPRR.R
OWWVP34VP4VPR 3#) z@Return (content, from_cache). Re-fetches if stale per CACHE_TTL.z6SELECT content, fetched_at FROM page_cache WHERE url=?<3F><00>
fetched_at<EFBFBD>contentT)<03>headers<72>follow_redirects<74>timeout<75>
page_cacher@F)r@rArGrH) <0C>execute<74>fetchone<6E> CACHE_TTL<54>get<65>int<6E>time<6D>httpx<70>HEADERS<52>raise_for_status<75>upsert<72>text<78>commit)r?r@rA<00>row<6F>ttl<74>now<6F>rs&&& r8<00> fetch_cachedr]ls<><00><00>
<0C>*<2A>*<2A>@<40>3<EFBFBD>&<26> <06><0E>h<EFBFBD>j<EFBFBD><08> <14>-<2D>-<2D> <09>4<EFBFBD>
(<28>C<EFBFBD>
<0A>d<EFBFBD>i<EFBFBD>i<EFBFBD>k<EFBFBD>
<1A>C<EFBFBD>
<EFBFBD><03>,<2C>'<27>'<27>3<EFBFBD>.<2E><12>9<EFBFBD>~<7E>t<EFBFBD>#<23>#<23> <0A> <09> <09>#<23>w<EFBFBD><14>r<EFBFBD>J<>A<EFBFBD><05><16><16><18><06>I<EFBFBD>I<EFBFBD><14>e<EFBFBD>5<> <0C><13>f<EFBFBD>f<EFBFBD>%<25><06>
<07>I<EFBFBD>I<EFBFBD>K<EFBFBD> <0C>6<EFBFBD>6<EFBFBD>5<EFBFBD>=<3D>r:zC[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}z<>"blindspotData":\{[^}]{0,400}"leftPercent":([\d.]+),"rightPercent":([\d.]+),"centerPercent":([\d.]+),"leftSrcCount":(\d+),"rightSrcCount":(\d+),"cntrSrcCount":(\d+)zn"start":"(20\d{2}-[^"]+)","title":"([^"]{10,200})","slug":"([a-z0-9][a-z0-9_-]{15,})","factuality":\{([^}]+)\}z"((?:[^"\\]|\\.)*)"c<00>0<00>V^8<>dQhR\R\/#)r5<00>sr6<00>rB)r7s"r8r9r9<00>s<00><00><11><11>s<EFBFBD><11>s<EFBFBD>r:c<04>^<00>\P!RV R24# \dTu#i;i)z#Decode a JSON-escaped string value.<2E>")<03>json<6F>loads<64> Exception)r_s&r8<00>_decoderf<00>s2<00><00><11><13>z<EFBFBD>z<EFBFBD>A<EFBFBD>a<EFBFBD>S<EFBFBD><01>(<28>#<23>#<23><> <14><11><10><08><11>s <00><00> ,<03>,c<00>R<00>V^8<>dQhR\R\R\\,/#)r5<00>data<74>categoryr6)rB<00>list<73>dict)r7s"r8r9r9<00>s*<00><00>G<13>G<13><03>G<13>s<EFBFBD>G<13>t<EFBFBD>D<EFBFBD>z<EFBFBD>Gr:c <04>p<00>.p\PV4EF<>pVP^4VP^4VP^4VP^43wrErgV\^VP 4R,
4VP 4pWP 4VP 4R,p \ PVRR4p
V
'd
V
R,MRp \PVR R4p R;p ;r<>R;p;ppV 'd<>\V P^44p \V P^44p\V P^44p\V P^44p\V P^44p\V P^44p\P!RVR R4pV'd\VP^44M^p\P!RVR R4pV'd\VP^44MRp\P!RVR R4pV'dVP^4MRp\P!RVR!R4pV'd\VP^44MRp\P!RV 4pV'd\VP^44M^p\P!R V4UUu/uFwppV\V4bK ppp\P!R
VR"R4p V 'd%\ PV P^44M.p!VP/R VbR V bR \V4bRVbRVR,bRVbRVbRV bRVbRVbRVbRVbRVbRVbRVbRVbRV!bRV/C4EK<> V#uuppi)#z.Extract all story objects from an RSC payload.i@ipN<>"biasSourceCount":(\d+)<29>"overallBias":([-\d.]+)<29>&"blindspot":"(left|right|center|none)"z'"description":"((?:[^"\\]|\\.){0,600})"<22>"sourceCount":(\d+)<29> "(\w+)":(\d+)z"interests":\[([^\]]*)\]<5D>slug<75>story_id<69>title<6C> description<6F>
start_date:N<>
N<> source_count<6E>bias_src_count<6E>left_pct<63>ctr_pct<63> right_pct<63>left_src_count<6E> ctr_src_count<6E>right_src_count<6E> overall_bias<61> blindspot<6F>
factuality<EFBFBD> interestsrii`<60><><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD>i<EFBFBD><69><EFBFBD><EFBFBD>iH<69><48><EFBFBD>i0<69><30><EFBFBD>)<0F>_STORY<52>finditer<65>group<75>max<61>start<72>end<6E>_UUID<49>findall<6C>_BLIND<4E>search<63>floatrQ<00>rerf<00>append)"rhri<00>stories<65>mr<6D>rtrr<00>fact_raw<61>before<72>after<65>uuidsrs<00>blindrzr|r{<00>left_cnt<6E> right_cnt<6E>ctr_cnt<6E>bscry<00>obr<62><00>bsr<73><00>desc_mru<00>scrx<00>k<>vr<76><00>int_mr<6D>s"&& r8<00> parse_storiesr<73><00>s<><00><00><10>G<EFBFBD> <13>_<EFBFBD>_<EFBFBD>T<EFBFBD> "<22><01>'(<28>w<EFBFBD>w<EFBFBD>q<EFBFBD>z<EFBFBD>1<EFBFBD>7<EFBFBD>7<EFBFBD>1<EFBFBD>:<3A>q<EFBFBD>w<EFBFBD>w<EFBFBD>q<EFBFBD>z<EFBFBD>1<EFBFBD>7<EFBFBD>7<EFBFBD>ST<53>:<3A>'U<>$<24><05>d<EFBFBD><15>c<EFBFBD>!<21>Q<EFBFBD>W<EFBFBD>W<EFBFBD>Y<EFBFBD><14>-<2D>.<2E><01><07><07> <09>:<3A><06><15>e<EFBFBD>e<EFBFBD>g<EFBFBD>q<EFBFBD>u<EFBFBD>u<EFBFBD>w<EFBFBD><14>~<7E>.<2E><05><16> <0A> <0A>f<EFBFBD>U<EFBFBD>V<EFBFBD>n<EFBFBD>-<2D><05> %<25>5<EFBFBD><12>9<EFBFBD>4<EFBFBD><08><17> <0A> <0A>f<EFBFBD>U<EFBFBD>V<EFBFBD>n<EFBFBD>-<2D><05>)-<2D>-<2D><08>-<2D>9<EFBFBD>)-<2D>-<2D><08>-<2D>9<EFBFBD>w<EFBFBD> <10><1D>e<EFBFBD>k<EFBFBD>k<EFBFBD>!<21>n<EFBFBD>-<2D>H<EFBFBD><1D>e<EFBFBD>k<EFBFBD>k<EFBFBD>!<21>n<EFBFBD>-<2D>I<EFBFBD><1D>e<EFBFBD>k<EFBFBD>k<EFBFBD>!<21>n<EFBFBD>-<2D>G<EFBFBD><1B>E<EFBFBD>K<EFBFBD>K<EFBFBD><01>N<EFBFBD>+<2B>H<EFBFBD><1B>E<EFBFBD>K<EFBFBD>K<EFBFBD><01>N<EFBFBD>+<2B>I<EFBFBD><1B>E<EFBFBD>K<EFBFBD>K<EFBFBD><01>N<EFBFBD>+<2B>G<EFBFBD><11>i<EFBFBD>i<EFBFBD>2<>F<EFBFBD>5<EFBFBD>6<EFBFBD>N<EFBFBD>C<><03>.1<EFBFBD><13>S<EFBFBD>Y<EFBFBD>Y<EFBFBD>q<EFBFBD>\<5C>*<2A>q<EFBFBD><0E><10>Y<EFBFBD>Y<EFBFBD>1<>6<EFBFBD>%<25>&<26>><3E> B<><02>-/<2F>u<EFBFBD>R<EFBFBD>X<EFBFBD>X<EFBFBD>a<EFBFBD>[<5B>)<29>T<EFBFBD> <0C><10>Y<EFBFBD>Y<EFBFBD>@<40>&<26><15><16>.<2E> Q<><02>#%<25>B<EFBFBD>H<EFBFBD>H<EFBFBD>Q<EFBFBD>K<EFBFBD>4<EFBFBD> 
<09><14><19><19>E<>v<EFBFBD>e<EFBFBD>f<EFBFBD>~<7E>V<><06>28<32>g<EFBFBD>f<EFBFBD>l<EFBFBD>l<EFBFBD>1<EFBFBD>o<EFBFBD>.<2E>d<EFBFBD> <0B><10>Y<EFBFBD>Y<EFBFBD>-<2D>u<EFBFBD> 5<><02>+-<2D>s<EFBFBD>2<EFBFBD>8<EFBFBD>8<EFBFBD>A<EFBFBD>;<3B>'<27>1<EFBFBD> <0C>-/<2F>J<EFBFBD>J<EFBFBD>7G<37><18>,R<>S<>,R<>D<EFBFBD>A<EFBFBD>q<EFBFBD>a<EFBFBD><13>Q<EFBFBD><16>i<EFBFBD>,R<>
<EFBFBD>S<><13> <09> <09>5<>v<EFBFBD>e<EFBFBD>f<EFBFBD>~<7E>F<><05>5:<3A>E<EFBFBD>M<EFBFBD>M<EFBFBD>%<25>+<2B>+<2B>a<EFBFBD>.<2E>1<><02> <09><0F><0E><0E>
<EFBFBD> <12>d<EFBFBD>
<EFBFBD> <16>h<EFBFBD>
<EFBFBD> <14>g<EFBFBD>e<EFBFBD>n<EFBFBD>
<EFBFBD> <1A>k<EFBFBD> 
<EFBFBD>
<19>e<EFBFBD>C<EFBFBD>j<EFBFBD> 
<EFBFBD> <1B>l<EFBFBD> 
<EFBFBD> <1D>n<EFBFBD>
<EFBFBD> <17>h<EFBFBD>
<EFBFBD> <16>g<EFBFBD>
<EFBFBD> <18>i<EFBFBD>
<EFBFBD> <1D>h<EFBFBD>
<EFBFBD> <1C>g<EFBFBD>
<EFBFBD> <1E>i<EFBFBD>
<EFBFBD> <1B>l<EFBFBD>
<EFBFBD> <18>i<EFBFBD>
<EFBFBD> <19>j<EFBFBD>!
<EFBFBD>" <18>i<EFBFBD>#
<EFBFBD>$ <17>h<EFBFBD>%
<EFBFBD> <0B>a#<23>H <13>N<EFBFBD><4E>5Ts<00>0N2c<00><<00>V^8<>dQhR\R\R\/#)r5rhrrr6)rBrk)r7s"r8r9r9<00>s!<00><00>6<06>6<06>s<EFBFBD>6<06>#<23>6<06>$<24>6r:c <04>>a<00>\3V3Rllp\P!R\P!V4,S\P4pV'dVP ^4MV!R4p\P!RS\P4pV'd\ VP ^44MV!R4p\PS4p/pR$Fop \P!RV R2S\P4p
V
'gK7R\V
P ^44R \V
P ^44/W<>&Kq \P!R
S4p V 'dE\P!R V P ^44U U u/uFwr<>V \V 4bK up p M/p\P!R S4p/R VbRVbRVbRV'd\ VP ^44MRbRV!R4bRV!R\4bRV!R\4bRV!R\4bRV!R4bRV'd\VP ^44MRbRV'd\VP ^44MRbRV'd\VP ^44MRbRV'd\VP ^44MRbR V'd\VP ^44MRbR!V'd\VP ^44MRbR"VbR#Vb#uup p i)%zDRicher extraction for a single article page (has wireStoryRefs etc).c<00><><<01>\P!VS4pV'dV!VP^44#R# \dR#i;i)<02>N)r<>r<>r<>re)<04>pattern<72>castr<74>rhs&& <20>r8rP<00>!parse_single_article.<locals>.get<65>sD<00><><00> <0E>I<EFBFBD>I<EFBFBD>g<EFBFBD>t<EFBFBD> $<24><01> <18>'(<28>4<EFBFBD><01><07><07><01>
<EFBFBD>#<23> 2<>d<EFBFBD> 2<><32><18> <18><17> <18>s<00>;<00>;<00>;<00> A
<03> A
z)"id":"([0-9a-f-]{36})"[^}]{0,200}"slug":"z"id":"([0-9a-f-]{36})"z2"title":"([^"]{10,200})"[^}]{0,100}"wireStoryRefs"z"title":"([^"]{10,200})"z"id":"z)".*?"sourceCount":(\d+).*?"percent":(\d+)<29>sources<65>percentz"factuality":\{([^}]+)\}rq<00>("description":"((?:[^"\\]|\\.){20,600})"rrrsrtruNrvz"start":"(20\d{2}-[^"]+)"rxrpryrmr<>rnr<>rorzr|r{r}rr~r<><00>bias_breakdown)<03>left<66>center<65>right) rBr<>r<><00>escape<70>DOTALLr<4C>rfr<>rQr<>r<>)rhrrrP<00>id_mrs<00>title_mrtr<>r<><00>side<64>bm<62>fmr<6D>r<>r<>r<>sf& r8<00>parse_single_articler<65><00>s<><00><><00><1D><18> <0E>9<EFBFBD>9<EFBFBD>A<>B<EFBFBD>I<EFBFBD>I<EFBFBD>d<EFBFBD>O<EFBFBD>S<>UY<55>[]<5D>[d<>[d<> e<>D<EFBFBD> $<24>t<EFBFBD>z<EFBFBD>z<EFBFBD>!<21>}<7D>#<23>.G<>*H<>H<EFBFBD><11>i<EFBFBD>i<EFBFBD>M<>t<EFBFBD>UW<55>U^<5E>U^<5E>_<>G<EFBFBD>)0<>G<EFBFBD>G<EFBFBD>M<EFBFBD>M<EFBFBD>!<21>$<24> %<25>c<EFBFBD>:U<>6V<36>E<EFBFBD> <13>M<EFBFBD>M<EFBFBD>$<24> <1F>E<EFBFBD><18>N<EFBFBD>+<2B><04> <0F>Y<EFBFBD>Y<EFBFBD><15>d<EFBFBD>V<EFBFBD>D<> E<> <10>"<22>)<29>)<29>
<EFBFBD><02> <0E>2<EFBFBD>$-<2D>s<EFBFBD>2<EFBFBD>8<EFBFBD>8<EFBFBD>A<EFBFBD>;<3B>/?<3F><19>C<EFBFBD>PR<50>PX<50>PX<50>YZ<59>P[<5B>L\<5C>#]<5D>N<EFBFBD> <20> ,<2C>
<0C><19><19>.<2E><04> 5<>B<EFBFBD>VX<56><02>
<EFBFBD>
<EFBFBD>3C<EFBFBD>R<EFBFBD>X<EFBFBD>X<EFBFBD>a<EFBFBD>[<5B>(Q<>R<>(Q<><04><01>!<21>S<EFBFBD><11>V<EFBFBD>)<29>(Q<>R<>^`<60>J<EFBFBD> <0F>Y<EFBFBD>Y<EFBFBD>B<>D<EFBFBD> I<>F<EFBFBD> <06><0E>4<EFBFBD> <06><12>8<EFBFBD> <06> <10>5<EFBFBD> <06> <16>v<EFBFBD>7<EFBFBD>6<EFBFBD><<3C><<3C><01>?<3F>3<>4<EFBFBD>  <06>
<15>3<EFBFBD>;<3B><<3C>  <06> <17>3<EFBFBD>5<>s<EFBFBD>;<3B>  <06> <19>3<EFBFBD>9<>3<EFBFBD>?<3F> <06> <17>3<EFBFBD>9<>5<EFBFBD>A<> <06> <14>3<EFBFBD>H<>I<> <06> <13>E<EFBFBD>5<EFBFBD><15><1B><1B>Q<EFBFBD><1E>0<>t<EFBFBD> <06> <14>E<EFBFBD>5<EFBFBD><15><1B><1B>Q<EFBFBD><1E>0<>t<EFBFBD> <06> <12>E<EFBFBD>5<EFBFBD><15><1B><1B>Q<EFBFBD><1E>0<>t<EFBFBD> <06> <19>%<25>3<EFBFBD>u<EFBFBD>{<7B>{<7B>1<EFBFBD>~<7E>.<2E>T<EFBFBD> <06> <1A>%<25>3<EFBFBD>u<EFBFBD>{<7B>{<7B>1<EFBFBD>~<7E>.<2E>T<EFBFBD> <06> <18>%<25>3<EFBFBD>u<EFBFBD>{<7B>{<7B>1<EFBFBD>~<7E>.<2E>T<EFBFBD> <06> <15>:<3A>! <06>" <19>><3E># <06><06><> Ss<00>Lc<00>R<00>V^8<>dQhR\R\\,R\/#)r5r?r<>r6)rrjrkrQ)r7s"r8r9r9*s%<00><00>6<0F>6<0F><06>6<0F><14>d<EFBFBD><1A>6<0F><03>6r:c<04>~<00>\\P!44p^pVEFpVPRVR,34P4pV'd,\ VR,;'gRP R44M \ 4pVP R4VPVR,4V'd<>VPRVR,VR ,VR
,VR ,VR ,VR ,VR,VR,VR,VR,VR,VR,RP\V44VVR,34EK1VPRVR,VR,VR,VR,VR,VR ,VR
,VR ,VR ,VR ,VR,VR,VR,VR,VR,\P!VR,4\P!VR,4VR,W"34V^, pEK VP4V#)zAInsert new / update existing articles. Returns count of new rows.z8SELECT categories, first_seen FROM articles WHERE slug=?rr<00>
categories<EFBFBD><00>,ria<>UPDATE articles SET
story_id=COALESCE(story_id, ?),
source_count=?, bias_src_count=?,
left_pct=?, ctr_pct=?, right_pct=?,
left_src_count=?, ctr_src_count=?, right_src_count=?,
overall_bias=?, blindspot=?,
description=COALESCE(description, ?),
categories=?, last_seen=?
WHERE slug=?rsrxryrzr{r|r}r~rr<>r<>rua<>INSERT INTO articles
(slug, story_id, title, description, start_date,
source_count, bias_src_count,
left_pct, ctr_pct, right_pct,
left_src_count, ctr_src_count, right_src_count,
overall_bias, blindspot,
factuality_json, interests_json,
categories, first_seen, last_seen)
VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)rtrvr<>r<>) rQrRrMrN<00>set<65>split<69>discard<72>add<64>join<69>sortedrc<00>dumpsrX)r?r<>r[<00>newr_rY<00>catss&& r8<00>upsert_articlesr<73>*s<><00><00> <0E>t<EFBFBD>y<EFBFBD>y<EFBFBD>{<7B> <1B>C<EFBFBD> <0C>C<EFBFBD> <14><01><10>j<EFBFBD>j<EFBFBD> F<><11>6<EFBFBD><19> <0C>
<EFBFBD>
<12>(<28>*<2A> <0C>=@<01>s<EFBFBD>C<EFBFBD> <0C>%<25>+<2B>+<2B><12>2<>2<>3<EFBFBD>7<>8<>S<EFBFBD>U<EFBFBD><04> <0C> <0C> <0C>R<EFBFBD><18> <0C><08><08><11>:<3A><1D><1F> <0E> <0E>J<EFBFBD>J<EFBFBD>#<23><13>:<3A><1D><12>><3E>"<22>A<EFBFBD>&6<>$7<><12>:<3A><1D><01>)<29> <0C>a<EFBFBD> <0B>n<EFBFBD><12>#<23>$<24>a<EFBFBD><0F>&8<>!<21><M<>:N<><12>><3E>"<22>A<EFBFBD>k<EFBFBD>N<EFBFBD><12>=<3D>!<21><14><18><18>&<26><14>,<2C>'<27><13><12>6<EFBFBD><19><1C> <0E>( <0F>J<EFBFBD>J<EFBFBD>G<01><13>6<EFBFBD><19>A<EFBFBD>j<EFBFBD>M<EFBFBD>1<EFBFBD>W<EFBFBD>:<3A>q<EFBFBD><1D>7G<37><11><<3C><1F><12>><3E>"<22>A<EFBFBD>&6<>$7<><12>:<3A><1D><01>)<29> <0C>a<EFBFBD> <0B>n<EFBFBD><12>#<23>$<24>a<EFBFBD><0F>&8<>!<21><M<>:N<><12>><3E>"<22>A<EFBFBD>k<EFBFBD>N<EFBFBD><15><1A><1A>A<EFBFBD>l<EFBFBD>O<EFBFBD>,<2C>d<EFBFBD>j<EFBFBD>j<EFBFBD><11>;<3B><1E>.H<><12>:<3A><1D><03> *<2A> <0E>$ <10>1<EFBFBD>H<EFBFBD>C<EFBFBD>a<15>b<07>I<EFBFBD>I<EFBFBD>K<EFBFBD> <0E>Jr:c<00>J<00>V^8<>dQhR\R\R,R\/#<00>r5rrr?Nr6)rBr)r7s"r8r9r9gs%<00><00>*<1B>*<1B>S<EFBFBD>*<1B>f<EFBFBD>t<EFBFBD>m<EFBFBD>*<1B>s<EFBFBD>*r:c<04>&a
a <0B>VRJpV'd \4p\ RV 2p\WR4wrEV'dVP4.o
\ 4o RV
V 3Rllp\
P !RV4FSp\VP^44p\
P!RV\
P4'dKKV!V4KU R Fpp \
P !W<>4FSp\VP^44p\
P!RV\
P4'dKKV!V4KU Kr \
P !R V4F#pV!\VP^444K% R
PS
4#) z<>
Fetch full article RSC payload and return a clean text blob for NLP.
Extracts: main title + description + all source article headlines.
N<EFBFBD> /article/rc<00>(<00>V^8<>dQhR\RR/#)r5rWr6Nr`)r7s"r8r9<00>(fetch_article_text.<locals>.__annotate__ws<00><00><1F><1F>#<23><1F>$<24>r:c<00><><<02>V'd@\V4^8<>d.VS9d%SPV4SPV4R#R#R#R#)<02>N)<03>lenr<6E>r<>)rW<00>parts<74>seens&<26><>r8r<><00>fetch_article_text.<locals>.addws7<00><><00> <0F>C<EFBFBD><04>I<EFBFBD><02>N<EFBFBD>t<EFBFBD>4<EFBFBD>'7<> <10>H<EFBFBD>H<EFBFBD>T<EFBFBD>N<EFBFBD> <11>L<EFBFBD>L<EFBFBD><14> <1E>(8<>N<EFBFBD>4r:z""title":"((?:[^"\\]|\\.){10,300})"zGetty|AFP|\/AFP|PHOTO-TAGzGetty|AFP|PHOTO-TAG|Author:z%"headline":"((?:[^"\\]|\\.){10,300})"<22> )r<>z$"excerpt":"((?:[^"\\]|\\.){20,400})"z$"summary":"((?:[^"\\]|\\.){20,400})") r=<00>BASE_URLr]<00>closer<65>r<>r<>rfr<>r<><00>Ir<49>) rrr?<00>own_dbr@rh<00>_r<5F>r<><00>tr<74>r<>r<>s && @@r8<00>fetch_article_textr<74>gs2<00><><00>
<10>4<EFBFBD>Z<EFBFBD>F<EFBFBD> <0A> <13>X<EFBFBD><02><16>Z<EFBFBD>y<EFBFBD><14><06> '<27>C<EFBFBD><1A>2<EFBFBD>I<EFBFBD>.<2E>G<EFBFBD>D<EFBFBD> <0A>
<EFBFBD><08><08>
<EFBFBD><19>E<EFBFBD><18>U<EFBFBD>D<EFBFBD><1F><1F> <10>[<5B>[<5B>><3E><04> E<><01> <13>A<EFBFBD>G<EFBFBD>G<EFBFBD>A<EFBFBD>J<EFBFBD> <1F><01><11>y<EFBFBD>y<EFBFBD>5<>q<EFBFBD>"<22>$<24>$<24>?<3F>?<3F> <0F><01>F<EFBFBD>F<01> <06><07>
<14><1B><1B>W<EFBFBD>+<2B>A<EFBFBD><17><01><07><07><01>
<EFBFBD>#<23>A<EFBFBD><15>9<EFBFBD>9<EFBFBD>;<3B>Q<EFBFBD><02><04><04>E<>E<><13>A<EFBFBD><06>,<2C> <06><10>[<5B>[<5B>A<>4<EFBFBD> H<><01> <0B>G<EFBFBD>A<EFBFBD>G<EFBFBD>G<EFBFBD>A<EFBFBD>J<EFBFBD> <1F> <20>I<01> <0F>8<EFBFBD>8<EFBFBD>E<EFBFBD>?<3F>r:c<00>J<00>V^8<>dQhR\R\R,R\/#r<>)rBrrk)r7s"r8r9r9<00>s%<00><00>
<12>
<12><03>
<12><16>$<24><1D>
<12>$<24>
r:c<04><><00>VRJpV'd \4p\ RV 2p\WR4wrE\W@4pV'dVP 4V#)z<Fetch a single article page; optionally cache + store in DB.Nr<4E>r)r=r<>r]r<>r<>)rrr?r<>r@rhr<><00>results&& r8<00> fetch_articler<65><00>sN<00><00> <0F>4<EFBFBD>Z<EFBFBD>F<EFBFBD> <0A> <13>X<EFBFBD><02><16>Z<EFBFBD>y<EFBFBD><14><06> '<27>C<EFBFBD><1A>2<EFBFBD>I<EFBFBD>.<2E>G<EFBFBD>D<EFBFBD>#<23>D<EFBFBD>/<2F>F<EFBFBD> <0A>
<EFBFBD><08><08>
<EFBFBD> <11>Mr:<00>forceFc <00>~<00>V^8<>dQhR\R\R\\\\,\3,/#)r5<00> category_slugr<67>r6)rBrDrCrjrk)r7s"r8r9r9<00>s8<00><00>.<2E>.<2E><16>.<2E> <10>.<2E> <0B>3<EFBFBD><04>T<EFBFBD>
<EFBFBD>D<EFBFBD> <20>!<21> .r:c<04><><00>\4p\ RV 2pV'd$VPRV34VP4\ W#R4wrEVP 4\ W@4pWV3#)z<>
Fetch one category page via HTTP only.
Uses a per-thread DB connection (psycopg2 connections are not thread-safe).
Returns (slug, stories, from_cache).
z
/interest/z"DELETE FROM page_cache WHERE url=?r)rr<>rMrXr]r<>r<>)r<>r<>r?r@rh<00>
from_cacher<EFBFBD>s&$ r8<00>_http_fetch_categoryr<79><00>se<00><00>
<12><1A>B<EFBFBD> <15>J<EFBFBD>j<EFBFBD><1D><0F>
0<EFBFBD>C<EFBFBD> <0C>
<EFBFBD>
<EFBFBD>
<EFBFBD>7<>#<23><16>@<40>
<EFBFBD> <09> <09> <0B>#<23>B<EFBFBD>Z<EFBFBD>8<><14>D<EFBFBD><06>H<EFBFBD>H<EFBFBD>J<EFBFBD><1B>D<EFBFBD>0<>G<EFBFBD> <18>:<3A> -<2D>-r:c
<00><><00>V^8<>dQhR\R\R\R\\\
,\3,/#)r5r<>r?r<>r6)rBrrDrCrjrk)r7s"r8r9r9<00>s><00><00> <1F> <1F><16> <1F><0E> <1F> <10> <1F>
 <0B>4<EFBFBD><04>:<3A>t<EFBFBD> <1B><1C> r:c<04><<00>\WR7wr4p\W4WE3#)zA
Fetch an interest category page.
Returns (stories, from_cache).
<EFBFBD>r<>)r<>r<>)r<>r?r<>r<>r<>r<>s&&$ r8<00>fetch_categoryr<79><00>s$<00><00>2<>-<2D>M<><1A>A<EFBFBD>
<EFBFBD><13>B<EFBFBD> <20> <12> <1E>r:<00>workersc <00><><00>V^8<>dQhR\R\\,R,R\R\R\
\\\
,3,/#)r5r?<00>slugsNr<4E>r<>r6)rrjrBrDrQrk)r7s"r8r9r9<00>sP<00><00><13><13><0E><13> <0F><03>9<EFBFBD>t<EFBFBD> <1B><13> <10> <13>
<11> <13> 
<EFBFBD>#<23>t<EFBFBD>D<EFBFBD>z<EFBFBD>/<2F><1A> r:c <04><><00>T;'g\\P44p/p\PP VR7;_uu_4pVUu/uFqvP \WrR7VbK pp\PPV4FZp W<>,p
V P4wr<>p \W 4W<>V
&V 'dRMRp\RV RV
R R\V 4R R 24K\ R R R 4V#uupi \d"p\R
T
R R T 24.YZ&R p?K<>R p?ii;i +'giT#;i) z<>
Fetch all (or given) interest categories in parallel (HTTP only),
then upsert results serially into DB from the calling thread.
Returns {slug: [story, ...]} mapping.
)<01> max_workersr<73>u💾u🌐<> r<>z<38<33>2z storiesu ✗ z ERROR: N)rj<00>KNOWN_INTERESTS<54>keys<79>
concurrent<EFBFBD>futures<65>ThreadPoolExecutor<6F>submitr<74><00> as_completedr<64>r<><00>printr<74>re)r?r<>r<>r<><00>targets<74>results<74>exr_<00>futs<74>frrr<>r<><00>cached<65>icon<6F>es&&$$ r8<00> fetch_allr<00>s.<00><00><14>3<>3<>t<EFBFBD>O<EFBFBD>0<>0<>2<>3<>G<EFBFBD>%'<27>G<EFBFBD> <13> <1B> <1B> .<2E> .<2E>7<EFBFBD> .<2E> C<> C<>r<EFBFBD>LS<4C>T<>G<EFBFBD>q<EFBFBD> <09> <09>.<2E><01> <09>?<3F><11>B<>G<EFBFBD><04>T<><1B>#<23>#<23>0<>0<><14>6<>A<EFBFBD><17>7<EFBFBD>D<EFBFBD> #<23>%&<26>X<EFBFBD>X<EFBFBD>Z<EFBFBD>"<22><01>F<EFBFBD><1F><02>,<2C> '<27><04> <0A>!'<27>v<EFBFBD>V<EFBFBD><04><15><02>4<EFBFBD>&<26><01>$<24>s<EFBFBD><1A>1<EFBFBD>S<EFBFBD><17>\<5C>!<21>,<<3C>H<EFBFBD>E<>F<>7<>
D<01> <13>N<EFBFBD><4E>U<01><><1D> #<23><15><06>t<EFBFBD>C<EFBFBD>j<EFBFBD><08><11><03>4<>5<> "<22><07> <0A><> #<23><>
D<01> C<> <13>N<EFBFBD>sI<00>D4<05> D<08>7,D4<05>$AD<06>2D4<05>D4<05> D1 <09>D, <09>&D4<05>,D1 <09>1D4<05>4 E c <00><><00>V^8<>dQhR\R\R\R,R\R\\P,/#)r5r?<00>limit<69>daysN<73> min_sourcesr6)rrQrj<00>sqlite3<65>Row)r7s"r8r9r9<00>sG<00><00><11><11><0E><11> <0E><11> <0E><04>*<2A><11><15> <11>

<EFBFBD>'<27>+<2B>+<2B><16> r:c<04><><00>RpV.pVeVR, pVPRV R24VPRV R2.VOVN54P4#)z*Query DB for top articles by source_count.zWHERE source_count >= ?z! AND start_date >= date('now', ?)<29>-z dayszSELECT * FROM articles z# ORDER BY source_count DESC LIMIT ?)r<>rM<00>fetchall)r?rrr<00>where<72>paramss&&&& r8<00> top_articlesr<00>si<00><00> &<26>E<EFBFBD><1F>=<3D>F<EFBFBD> <0B><17> <0A>4<>4<><05><0E> <0A> <0A><01>$<24><16>u<EFBFBD>o<EFBFBD>&<26> <0A>:<3A>:<3A>
!<21>%<25><17>(K<>L<><18>&<26><18>%<25><18> <06><0F>h<EFBFBD>j<EFBFBD>r:c<00>^<00>V^8<>dQhR\\P,R\RR/#)r5<00>rows<77>headerr6N)rjrr rB)r7s"r8r9r9<00>s*<00><00><10><10>D<EFBFBD><17><1B><1B>%<25><10>s<EFBFBD><10><04>r:c <00><><00>\RR& 24\RV R\V4 R24\R& R24\V^4EF1wr#RpVR,e%RVR,R R VR
,R R VR ,R R 2pVR,;'gRPRR4pVR,eRVR,R 2MRpVR,'d RVR, 2MRp\VR RVR,R RV V V RVR, R2
4\RVR,R , 24VR!,'d\RVR!,R", 24\R#V R24\R$VR%, 24\4EK4 R#)'<27>
r<EFBFBD>z (z
artikler)r<>rzNz Lz.0fz% Cr{z% Rr|<00>%r<>r<>u · r<>z bias=z+.2fr<EFBFBD>z blindspot=r<>z. [rx<00>4z srcz] [rv<00>]z rt:N<>PNru:N<>ZNz [z /article/rrzL============================================================================)r<>r<><00> enumerate<74>replace)rr<00>i<>a<>biasr<73>r<>r<>s&& r8<00> print_topr<00>s<><00><00> <09>B<EFBFBD>v<EFBFBD>h<EFBFBD>-<2D><18> <09>B<EFBFBD>v<EFBFBD>h<EFBFBD>c<EFBFBD>#<23>d<EFBFBD>)<29><1B>J<EFBFBD>
/<2F>0<> <09>V<EFBFBD>H<EFBFBD>B<EFBFBD>-<2D><18><19>$<24><01>"<22><04><01><11><04> <0C>Z<EFBFBD>=<3D> $<24><18><11>:<3A><1D>s<EFBFBD>+<2B>3<EFBFBD>q<EFBFBD><19>|<7C>C<EFBFBD>.@<40><03>A<EFBFBD>k<EFBFBD>N<EFBFBD>SV<53>CW<43>WX<57>Y<>D<EFBFBD><11>,<2C><0F>%<25>%<25>2<EFBFBD>.<2E>.<2E>s<EFBFBD>F<EFBFBD>;<3B><04>56<35>~<7E>5F<35>5R<35><17><11>><3E>*<2A>4<EFBFBD>0<>1<>XZ<58><02>23<32>K<EFBFBD>.<2E>.<2E><1C>a<EFBFBD> <0B>n<EFBFBD>-<2D>.<2E>b<EFBFBD><02> <0A><11>1<EFBFBD><05>S<EFBFBD><11>><3E>*<2A>1<EFBFBD>-<2D>T<EFBFBD>$<24><16><02>t<EFBFBD>B<EFBFBD>4<EFBFBD>s<EFBFBD>1<EFBFBD>\<5C>?<3F>BS<42>ST<53>U<>V<> <0A><04>Q<EFBFBD>w<EFBFBD>Z<EFBFBD><03>_<EFBFBD>%<25>&<26>'<27> <0C>]<5D> <1B> <1B> <11>D<EFBFBD><11>=<3D>)<29>#<23>.<2E>/<2F>0<> 1<> <0A><05>d<EFBFBD>V<EFBFBD>1<EFBFBD>o<EFBFBD><1E> <0A> <0A>a<EFBFBD><06>i<EFBFBD>[<5B>)<29>*<2A> <0A><07>#r:<00>__main__r<5F><00>(r<00>fetched<65>))<02>indent<6E> ensure_asciiriz) z stories
c<00><00>VR,#)rxr<)<01>xs&r8<00><lambda>r($s <00><00>q<EFBFBD><1E>/@r:T)<02>key<65>reversez [rxrz src] rt:N<>FNz--forcez Fetching all z categories (force=u)…
r<EFBFBD>)rruTop 30 seneste z dagec<00>b<00>V^8<>dQh/^\9d\\\3,;R&#)r5r<>)<03>__conditional_annotations__rkrB)r7s"r8r9r9s,<00><00> <04> <04>X0<02>0<02><14>c<EFBFBD>3<EFBFBD>h<EFBFBD><1E>0<02>Y r:rFi`T)r)N)<03>r5<00>)z Top artikler):r-<00>__doc__r<5F>rcrRrrS<00>concurrent.futuresr<73><00>pathlibrr?rr<00>__file__<5F>parent<6E>DB_PATHr<48>rOrTr<>r=r]<00>compiler<65>r<>r<><00> _JSON_STRrfr<>r<>r<>r<>r<>r<>r<>rrr<00>__name__<5F>sysr<73><00>argvrrr@rhrr<>r<>r<>r<>r<>r_r<>rrr<>r9)r-s@r8<00><module>r;s<><00><><01> <04>
<EFBFBD> <0B> <0B><0E> <0C><19><18><1F> <11><18>N<EFBFBD> !<21> !<21>$4<> 4<><07> !<21><08><0F><07> <0A> <0B> <02> <09> <11>F<> <0C>/<2F> <09>3<EFBFBD><1C>b<>  <02><07>0#<02> <0C>H<EFBFBD>0#<02><14>$4<>0#<02><18>$7<>0#<02><15>$4<> 0#<02>
!<21>$><3E> 0#<02> <12>M<EFBFBD> 0#<02><15>$4<>0#<02><14>O<EFBFBD>0#<02><14>O<EFBFBD>0#<02><14>O<EFBFBD>0#<02> <0A>H<EFBFBD>0#<02> <0B>F<EFBFBD>0#<02><10>K<EFBFBD>0#<02><12>M<EFBFBD>0#<02><14>O<EFBFBD>0#<02> <13>N<EFBFBD>!0#<02>"<1B>$:<3A>#0#<02>$#<23>$B<>%0#<02>&<1B>$8<>'0#<02>(<15>$4<>)0#<02>* <0A>H<EFBFBD>+0#<02>,<13>N<EFBFBD>-0#<02>0 <0A>$5<>10#<02>2 <0A>H<EFBFBD>30#<02>4<17>$6<>50#<02>6<0E>I<EFBFBD>70#<02>8<0E>I<EFBFBD>90#<02>:<10>K<EFBFBD>;0#<02><<11>L<EFBFBD>=0#<02>><15>$4<>?0#<02>@<0F>J<EFBFBD>A0#<02>D<0E>I<EFBFBD>E0#<02>F<11>L<EFBFBD>G0#<02>H<15>$4<>I0#<02>J <0A>H<EFBFBD><11>M<EFBFBD><15>$5<><14>$4<><14>$4<><14>$4<><12>N<EFBFBD><0F>K<EFBFBD><12>N<EFBFBD><0F>K<EFBFBD><0E>J<EFBFBD>_0#<02><0F>0<02>l<16><19>4 <0B>
<EFBFBD>
<EFBFBD>Y<>Z<><05>
<0C><1A><1A>G<01>
<02><06>
<0C><1A><1A> <20>
<02><06> <0F>J<EFBFBD>J<EFBFBD>-<2D> .<2E> <09><11>G<13>T6<06>z6<0F>z*<1B>Z
<12>.<2E><18>.<2E>* <1F><18> <1F><13><18> <13>
<16> <13><13>@<11>,<10>0 <0C>z<EFBFBD><19><0E> <0F><18>B<EFBFBD>
<EFBFBD>3<EFBFBD>8<EFBFBD>8<EFBFBD>}<7D><01><19>c<EFBFBD>h<EFBFBD>h<EFBFBD>q<EFBFBD>k<EFBFBD>Y<EFBFBD>6<><12>x<EFBFBD>x<EFBFBD><01>{<7B><04><1A><1A>9<EFBFBD>T<EFBFBD>F<EFBFBD>+<2B><03>#<23>B<EFBFBD><03>Y<EFBFBD>7<> <0C><04>f<EFBFBD>%<25>d<EFBFBD>D<EFBFBD>1<><06> <0A><01>f<EFBFBD>(<28>)<29>4<>A<EFBFBD>6<>7<> <0A>d<EFBFBD>j<EFBFBD>j<EFBFBD><16><01><05>><3E>?<3F> <0C>S<EFBFBD>X<EFBFBD>X<EFBFBD><1D>!<21> <1B><03><08><08><11> <0B>z<EFBFBD> 9<><12>x<EFBFBD>x<EFBFBD><01>{<7B><04>(<28><14>r<EFBFBD>2<><0F><07><16> <0A><01>f<EFBFBD>(<28>)<29>4<>C<EFBFBD><03>G<EFBFBD> <0C>~<7E>Z<EFBFBD>P<>Q<><17><07>%@<40>$<24>O<>A<EFBFBD> <11>C<EFBFBD><01>.<2E>)<29>!<21>,<2C>F<EFBFBD>1<EFBFBD>W<EFBFBD>:<3A>c<EFBFBD>?<3F>2C<32>D<> E<>P<01><1A>S<EFBFBD>X<EFBFBD>X<EFBFBD>%<25><05><11><04> <0A> <0A>c<EFBFBD>/<2F>2<>3<>3F<33>u<EFBFBD>g<EFBFBD>V<EFBFBD>T<>U<><11>"<22>E<EFBFBD>"<22><1B>B<EFBFBD>b<EFBFBD>t<EFBFBD>4<><04><11>$<24>-<2D>d<EFBFBD>V<EFBFBD>5<EFBFBD>9<>:<3A><06>H<EFBFBD>H<EFBFBD>J<EFBFBD>9r: