diademoff/2ch

About the project

Want to download every file from /b or another board? Without duplicates? Only files larger or smaller than X kilobytes? Only images or only videos? Only files from a specific thread? Or do you need a tracker that picks threads by keyword? It is all here, and more!

The repository contains a ready-made set of scripts for Dvach (2ch), and every script can be customized for your own tasks. With minimal Python knowledge you can easily write a script for your own needs. All the details are below.

Installation

Video instructions

Install Python

Download the zip archive or:

git clone https://github.com/diademoff/2ch

Install the dependencies:

cd 2ch
pip install -r requirements.txt

Run the script you need:

python {script_name}.py

List of scripts

Editing the scripts

All scripts can be edited to suit your tasks.

  • thread_saver.py
    • FOLDER = 'saver' - Name of the folder where files are saved
    • SAVE_MEDIA - Whether to save images and videos
    • DELAY - Refresh interval in seconds
  • tracker.py
    • text_limit = 155 - Line length
    • board_names = 'b news sex v hw gg dev soc rf ma psy fet' - List of boards (separated by spaces)
    • KEY_WORDS - Keywords to track
  • popular.py
    • text_limit = 164 - Line length
    • max_lines = 55 - Maximum number of lines in the output
    • board_name = 'b' - The board to parse
    • KEY_WORDS - Show only threads that contain the keywords
  • board_media.py
    • BOARD = 'b' - Name of the board to download files from
    • FOLDER_NAME = 'media' - Name of the folder to download files into
    • KEY_WORDS = [] - Select threads by keyword; if no keywords are given, files from all threads are downloaded
    • EXTENSIONS = [] - File extensions to download
    • MAX_FILE_SIZE - Maximum file size in kilobytes
    • MIN_FILE_SIZE - Minimum file size in kilobytes
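
To illustrate how the board_media.py settings could combine, here is a minimal sketch of a file filter. The function file_is_wanted and its parameters are hypothetical names made up for this example, not the script's actual code. It assumes that sizes are given in kilobytes, that an empty EXTENSIONS list means "any extension", and that None means "no size limit".

```python
import os

EXTENSIONS = ["jpg", "png"]  # assumption: an empty list would mean "any extension"
MAX_FILE_SIZE = 5000         # kilobytes; assumption: None would mean "no upper limit"
MIN_FILE_SIZE = 10           # kilobytes; assumption: None would mean "no lower limit"


def file_is_wanted(name, size_kb, extensions=EXTENSIONS,
                   min_kb=MIN_FILE_SIZE, max_kb=MAX_FILE_SIZE):
    """Hypothetical filter: check the extension first, then the size bounds."""
    ext = os.path.splitext(name)[1].lstrip(".").lower()
    if extensions and ext not in [e.lower() for e in extensions]:
        return False
    if min_kb is not None and size_kb < min_kb:
        return False
    if max_kb is not None and size_kb > max_kb:
        return False
    return True
```

With these values, a 100 KB jpg passes the filter, while a webm file or a 5 KB png does not.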

FAQ

  • The script does not start.

Check that the dependencies are installed: pip install -r requirements.txt. Check the encoding of the files. Check that you have Python 3 or later installed.
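
A quick way to check the interpreter version is sketched below. The README only asks for Python 3; the minimum of 3.6 used here is an assumption based on the f-strings in the examples further down, which need Python 3.6 or newer. The helper name python_is_supported is made up for this sketch.

```python
import sys


def python_is_supported(version_info=sys.version_info, minimum=(3, 6)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if not python_is_supported():
    print("Python 3.6 or newer is needed to run the examples")
```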

  • How are images compared?

Images are compared by content. Even if two images have different extensions (png and jpg) or different sizes, they are still recognized as identical.
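
The README does not say which algorithm the scripts use for this. One common way to compare images by content regardless of format and size is an average hash, sketched below under that assumption. To keep the sketch self-contained, an image is represented as a plain 2D list of grayscale values; the function names are illustrative only.

```python
def average_hash(pixels, hash_size=8):
    """Shrink a grayscale image (2D list of numbers) to a hash_size x hash_size
    grid by block averaging, then mark each cell as above or below the mean."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for row in range(hash_size):
        y0 = row * height // hash_size
        y1 = max((row + 1) * height // hash_size, y0 + 1)
        for col in range(hash_size):
            x0 = col * width // hash_size
            x1 = max((col + 1) * width // hash_size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    return tuple(1 if cell > mean else 0 for cell in cells)


def hamming_distance(hash_a, hash_b):
    """Number of positions where two hashes differ; 0 means the same content."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


# The same horizontal gradient at two sizes, and a different (vertical) gradient.
small = [[x * 16 for x in range(16)] for _ in range(16)]
large = [[(x // 2) * 16 for x in range(32)] for _ in range(32)]
other = [[y * 16 for _ in range(16)] for y in range(16)]

same_content = hamming_distance(average_hash(small), average_hash(large)) == 0
different_content = hamming_distance(average_hash(small), average_hash(other)) > 0
```

Because the hash is computed on a fixed-size grid, the resized copy produces the same hash as the original, while the unrelated image does not.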

  • Do you use the Dvach API?

Yes. Specifically:

https://2ch.hk/makaba/mobile.fcgi?task=get_thread&board={board_name}&thread={num}&post=1
http://2ch.hk/{name}/threads.json
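
As a small illustration, the two URL templates above can be filled in with str.format. The constant and function names here are made up for this sketch and are not taken from dvach.py; the templates are copied as listed.

```python
THREAD_URL = ("https://2ch.hk/makaba/mobile.fcgi"
              "?task=get_thread&board={board_name}&thread={num}&post=1")
THREADS_URL = "http://2ch.hk/{name}/threads.json"


def thread_json_url(board_name, num):
    """URL of the JSON with the posts of one thread."""
    return THREAD_URL.format(board_name=board_name, num=num)


def board_threads_url(name):
    """URL of the JSON with the list of threads on a board."""
    return THREADS_URL.format(name=name)
```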
  • Why do you need Beautiful Soup?

Mainly to remove HTML tags from posts. If a post contains bold text, it arrives wrapped in a tag, for example <strong>text</strong>. The tag has to be removed so that only the text remains.
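
The idea can be shown without any third-party dependency. The sketch below uses the standard library's html.parser as a stand-in for Beautiful Soup, so it is not the code from the scripts; strip_tags is a name made up for this example.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text that is still buffered
    return "".join(parser.parts)


plain = strip_tags("<strong>bold</strong> text")
```

Here plain holds the post text with the strong tag removed.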

  • How do I specify keywords?

Open the script you need and edit it following the sample. Pay attention to the formatting, commas, and quotes.

KEY_WORDS = [
"tsui'",
"mp4"
]
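
How such a list might be applied to a thread is sketched below. This is a guess at the behaviour, not the code behind IsOk(): it assumes case-insensitive substring matching, and that an empty list accepts every thread, as described for board_media.py. The function name thread_matches is hypothetical.

```python
def thread_matches(comment, key_words):
    """Hypothetical check: True if any keyword occurs in the thread text."""
    if not key_words:
        return True  # no keywords given: accept every thread
    text = comment.lower()
    return any(word.lower() in text for word in key_words)
```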
  • Are the scripts cross-platform?

Yes. The scripts have been tested on Linux and Windows.

For developers

The entire API lives in the file dvach.py. Import it:

import dvach

Structure

  • Board
    • name: str - Board name
    • threads: dict - The threads on the board, as a dictionary. The key is the thread number, the value is a Thread object
    • json_link: str - Link to the JSON with the threads
    • from_json() - Build a Board object from JSON
    • json_download() - Download the board's JSON
    • thread_exists() - Whether the board has a thread with the given number
    • update_threads() - Refresh the list of threads on the board
    • sort_threads_by_posts() - Sort the threads by post count; the closer a thread is to the start of the list, the more posts it has
    • get_new_threads() - Compare the current thread list with another one and get a dictionary of new threads
    • get_dead_threads() - Compare the current thread list with another one and get a dictionary of threads that have sunk
  • Thread
    • comment: str - Text of the OP post
    • num: str - Thread number
    • posts_count: int - Number of posts
    • score: float - The thread's score
    • subject: str - Shortened comment
    • views: int - Number of views
    • unique_posters: int - Number of unique posters (available after the posts are updated)
    • board_name: str - The board the thread belongs to
    • posts = [] - List of posts
    • get_link: str - Link to the thread
    • get_op_post: Post - Get the OP post
    • json_posts_link: str - Link to the thread's JSON
    • save(path) - Save the thread's posts as HTML in the given folder
    • IsOk() - Whether the thread matches the given keywords
    • update_posts() - Download the JSON and refresh the post list; calls get_posts()
    • get_posts() - Parse the JSON and update unique_posters and posts
    • json_download() - Get the raw JSON of the posts
  • Post
    • comment: str - Text
    • date: str - Post date
    • email: str
    • op: int
    • num: str - Number
    • files: [] - List of files
  • Post_file
    • displayname: str - Display name
    • name: str - Name
    • download_link: str - Download link
    • width: int - Width
    • height: int - Height
    • size: int - File size
    • IsImage: bool - Whether the file is an image
    • IsVideo: bool - Whether the file is a video
    • save() - Save the file to the given path
    • IsOk() - Whether the file matches the given extensions and the maximum and minimum size
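
The comparison behind get_new_threads() and get_dead_threads() can be pictured as a difference between two dictionaries keyed by thread number. The sketch below works on plain dictionaries and is only an illustration: the function names and the argument order are assumptions, and the real methods may differ.

```python
def new_threads(current, previous):
    """Threads present in `current` but absent from `previous`."""
    return {num: thread for num, thread in current.items() if num not in previous}


def dead_threads(current, previous):
    """Threads present in `previous` that have disappeared from `current`."""
    return {num: thread for num, thread in previous.items() if num not in current}


previous = {"100": "old thread", "101": "still alive"}
current = {"101": "still alive", "102": "fresh thread"}
```

For these two snapshots, thread 102 is new and thread 100 has sunk.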

Boards

The Board class lets you work with boards (b, news, po, soc, etc.).

Declaration:

board = dvach.Board('b')

The variable board now holds board b, but it contains no information other than the board's name. To get the list of threads on the board:

board.update_threads()

The threads field now holds a dictionary of threads. The key is the thread number and the value is the thread (Thread).

Get a list of thread numbers:

# List of thread numbers; each number is a string.
thread_nums = list(board.threads.keys())

Sort by popularity and get the list of thread numbers again:

board.sort_threads_by_posts()
thread_nums = list(board.threads.keys())

The first element is now the number of the most popular thread:

most_popular_num = thread_nums[0]
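
Reading the first key works because Python dictionaries keep insertion order (guaranteed since Python 3.7). The sketch below shows the idea on a stand-in class; FakeThread and sort_by_posts are invented for this example and say nothing about how sort_threads_by_posts() is actually written.

```python
from dataclasses import dataclass


@dataclass
class FakeThread:
    """Stand-in for dvach.Thread with only the fields the sketch needs."""
    num: str
    posts_count: int


def sort_by_posts(threads):
    """Return a new dict ordered from the most posts to the fewest."""
    ordered = sorted(threads.items(),
                     key=lambda item: item[1].posts_count,
                     reverse=True)
    return dict(ordered)


threads = {
    "1": FakeThread("1", 12),
    "2": FakeThread("2", 340),
    "3": FakeThread("3", 57),
}
sorted_nums = list(sort_by_posts(threads).keys())
```

After sorting, the first key is the thread with the most posts.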

Threads

We have the number of the most popular thread; now get the thread itself from the threads dictionary:

thread = board.threads[most_popular_num]

The values in this dictionary have type Thread. Check the type of the variable thread:

print(type(thread))

This prints the Thread class.

Get the list of posts in the thread:

print(f"Number of posts (len of posts): {len(thread.posts)}")
print(f"Number of posts (posts_count): {thread.posts_count}")

thread.update_posts()

print(f"Number of posts (len of posts): {len(thread.posts)}")
print(f"Unique posters: {thread.unique_posters}")

Output:

Number of posts (len of posts): 0
Number of posts (posts_count): 60
Number of posts (len of posts): 64
Unique posters: 34

unique_posters is available only after update_posts() or get_posts() has been called.

Getting the post count with len(thread.posts) is more accurate but requires downloading all the posts, whereas thread.posts_count is already known when the board's thread list is fetched.

Saving a thread to HTML

To save a thread, use the HtmlGenerator class and its get_thread_htmlpage method. The method returns HTML code that can be written to a file.

import os

op_file = thread.posts[0].files[0]  # Image in the OP post
img_path = os.path.normpath(f'./{op_file.name}')  # Path where it will be saved
op_file.save(img_path)  # Save the image

# Get the HTML
html = dvach.HtmlGenerator.get_thread_htmlpage(thread, img_path)

# Create the file, write the HTML page to it, and close it
with open(f'thread_{thread.num}.html', 'w', encoding='utf-8') as html_file:
    html_file.write(html)

Or use the method:

# The file is saved as thread_{num}.html in the folder where the script is running
thread.save('.')

Posts

After update_posts() has fetched the posts, the posts field holds the list of posts, starting with the OP post.

Look at the second post in the thread:

post = thread.posts[1]

print(f"Number: {post.num}")
print(f"Text: {post.comment}")
print(f"Number of files: {len(post.files)}")

Output:

Number: 210762237
Text: Bamp
Number of files: 1

Files

Now get the first file in the post, if there is one:

if len(post.files) > 0:
    file = post.files[0]
    print(type(file))

This prints the Post_file class.

Look at more information about the file:

print(f"File name: {file.name}")
print(f"Width: {file.width}")
print(f"Height: {file.height}")
print(f"Display name: {file.displayname}")
print(f"Link: {file.download_link}")

Output:

File name: 16200245064090.jpg
Width: 3118
Height: 1754
Display name: 1620024504280.jpg
Link: https://2ch.hk/b/src/245763818/16200245064090.jpg

The file is easy to save:

file.save(file.name)

The file will be saved as 16200245064090.jpg in the directory where the script is running.

You can also give a custom path:

file.save(f"/home/username/{file.name}")
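
The README does not say whether save() creates missing folders, so it may be safer to prepare the target path first. A minimal sketch using pathlib follows; build_save_path is a hypothetical helper, not part of dvach.py.

```python
from pathlib import Path


def build_save_path(folder, file_name):
    """Create `folder` if it is missing and return the full path for the file."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)
    # Keep only the final name component so the file stays inside the folder.
    return str(target_dir / Path(file_name).name)
```

The returned string could then be passed on, for example file.save(build_save_path("media", file.name)).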

Summary

All the code used in the examples:

import dvach
import os

# Declare the board
board = dvach.Board('b')

# Download the threads
board.update_threads()

# Get the list of thread numbers
thread_nums = list(board.threads.keys())

# Sort by number of posts
board.sort_threads_by_posts()

# Refresh the list of thread numbers
thread_nums = list(board.threads.keys())

# Number of the most popular thread
most_popular_num = thread_nums[0]

# The most popular thread
thread = board.threads[most_popular_num]

# Check the type of the variable
print(type(thread))

print(f"Number of posts (len of posts): {len(thread.posts)}")
print(f"Number of posts (posts_count): {thread.posts_count}")

# Download the posts
thread.update_posts()

print(f"Number of posts (len of posts): {len(thread.posts)}")
print(f"Unique posters: {thread.unique_posters}")

op_file = thread.posts[0].files[0]  # Image in the OP post
img_path = os.path.normpath(f'./{op_file.name}')  # Path where it will be saved
op_file.save(img_path)  # Save the image

# Get the HTML
html = dvach.HtmlGenerator.get_thread_htmlpage(thread, img_path)

# Create the file, write the HTML page to it, and close it
with open(f'thread_{thread.num}.html', 'w', encoding='utf-8') as html_file:
    html_file.write(html)

# Get the second post (the one right after the OP post)
post = thread.posts[1]

print(f"Number: {post.num}")
print(f"Text: {post.comment}")
print(f"Number of files: {len(post.files)}")

if len(post.files) > 0:
    # Get the first file
    file = post.files[0]
    print(type(file))

    print(f"File name: {file.name}")
    print(f"Width: {file.width}")
    print(f"Height: {file.height}")
    print(f"Display name: {file.displayname}")
    print(f"Link: {file.download_link}")

    # Save the file
    file.save(file.name)
    # file.save(f"/home/username/{file.name}")

About

A set of scripts for Dvach (2ch.hk)

License

MIT license
