【浅谈系列】对 Vigenère cipher 及其变体的攻击

12 minute read

Published:

非常基本的一个问题,只是简单的做个汇总和记录,没必要细说。

非常基本的一个问题,只是简单的做个汇总和记录,没必要细说。早就遇见并解决过这类题很多次,只是密码分析学布置的作业里又给了一个变体,我解决后顺便总结一下 Vigenère cipher

0. 参考资料

1. Vigenère cipher

非常经典的多表替换密码,几乎所有应用密码学相关书籍都有细讲,不用赘述。

相比单表替换密码,由于使用了多表,使得简单的字频分析对其不再管用。

2. 分析

Vigenère cipher 的基本分析思路如下。

首先确定密钥长度。方法有两种,一是 Kasiski Test,也就是在密文里找到某个多次重复的四元组或是五元组,计算重复组里面相邻两个的距离,所有距离求最大公因数,最大公因数自然大概率就是密钥长度;二是重合指数法,通过计算重合指数,判断该分组长度是否正确,如果正确,则每个分组的重合指数非常接近 $0.065$。

其次是确定位移。密钥长度确定后的每个分组都是一个独立的单表,因此我们要将每个分组还原为一个整体的单表,所以要找到每个单表相对某个作为参照的单表的位移。还原为整体的一个单表之后才能使用字频分析等适用于单表替换密码的分析方法。确定位移可以计算互重合指数,或是相对熵,或是用卡方检验,三者都是反映两个分布间关系的。

最后是对一个单表替换的分析。如我上篇博客所述,Hill Climbing 即可。

3. 推导

在密码分析学课上所用 PPT 里,给出了理论上的推导如下,推出了密钥长度和重合指数的关系:

整体推导还是挺有意思的,给了我们平时写代码每个分组都算一遍重合指数以外不同的思路。

4. 重合指数互重合指数

不用多说,详见 密码分析学 - 冯登国

重合指数本质上也是信息论里的东西,可以化成 Renyi Entropy 的形式,推导详见 Index of coincidence - John D. Cook

注意,国内有些中文博客经常把 Renyi EntropyRayleigh quotient 搞混,一概翻译为 瑞利熵。然而这两者是完全不同的概念。其实我觉得前者应该翻译成 伦伊熵,后者是 瑞利商

5. KL 散度卡方检验

详见 Case Study: Cryptanalysis of Vigenere Ciper - Soreat_u’s Blog,这篇博客里用的就是卡方检验。

KL 散度 就是相对熵,具体定义参考 Kullback–Leibler divergence - Wikipedia。相对熵和机器学习里著名的交叉熵非常相似,可自行搜索了解。

6. 变体及攻击代码

变体很多,比如 Beaufort 等等。在 CTF 比赛里,De1CTF 2019 出现过一个变体 xorz,其实这个是直接用了 Cryptopals 上第一章的最后一题。

密码分析学课上王鹏老师布置的作业里也给了一个变体,我对其解决如下。

题目如下:

1. 攻击结果

我的编号是 $20$ 号,计算 $(20 \times 8 + 13) \pmod {45} \equiv 38$。

所以我的题目,也就是密文是:

qyxqzqabxieifewabphnqgennthtqduzjkpqfivgssirruidtoayiylehusmfsxqtvusrnthximtgcapngvfsbzvqmappiinibjjaiveholxqbnpcegrrtsmtbzcrniawjqhteuzqgemtyvnekitnhtoxnmcpcegrifeakhhkucwnnqyoighzgvomuencckqbirkhcemqpawjcutnhypmkzrsjxcizjickbduomsaccgmasjfokcrvoknetrknjfgvfhdlbntchjuptodhlsszxzcktnhqodfuruabsomydhyvcezmfinqkujsyfjxvumlcvsrtzfphcixuvkcegrgsfllydclnsrksxnqnpzorwotacdforbqckhznucqwvcbgirjqqgooteexknoziykmuuzpikrfbgibzmprymlfawpagirjahxqbugzwuogtqiuysiqcmjxqesonhxkmkbllsjqtftghwixflvizjfzsoxaknvdtltzqgwgmbhyvuzmmfinqkujgakzrhqpiexlzpinztmqeanuaaiwgibmkmkhrnyvxhehuubyzdckrgkxlzzzinsjuimplmaswxamxypmkzfiimncvyinfpvwuakqwntzjsvfqlkocpexiesmytazuzmnriwuxvtnhxojkzniqgyctzkdyluckclbniehjirlopetniynntzoghxovttnfzqsxrqsjoglkqnfiwgkmbmzoldxyngkptcpzaqjgkcmteobgziyrnhkrhrmbivsxofsssbxqlgonfggxibyycdtpvzaulbfgeyjokxufttjfzmyamtpmkflsjmswgmrsuijkavojmacxychfqlksrsbjjrhtnhxbvmyszjjyjfvxakyjieqpxagmtifyzyielngnmoydcsjeogtotyaulcqyfhtmtwrntzqgrzqsjucvitusaucjneytazhycjwzqgrpfqmhqxflvqiccufaeyyjxxocpzqgxxfejuqseqlibhcgiylzmmuhmclqxfbfqnzfqdyyvenpncxmmyyyvmlssexlbxqnbzqekrnfijteeirvsgvfavnzbpgitnhmmkkapfzqcrxiybzppqbuwxlcgmueyjotzoysnxnnzmrmkhlqbcsvbyctfscktxkbrsjuprzjsmtkxtpiwvwyctvhzfjxxrpiaptehtfzqordlssknzkmtuzhhtieayvjfzkmxakbjibmbilkhiqivkmkcomlbnfepasuiouxobskhcgmgrzhhlilssvqszifejeoxhmtlznihzviwuztzuaiirglzvwmkkddtllbrfilyaaudxdmysvbgeuaetaabizieznutfxevywjepqyvxegqgemflxqbuycxjkujfhnoktmpfnkszpmmjtgtieaucaoqrtnzsgvfqlnjuujioinxvxklfzsnlnpfemmgzhzvlawjxzreulqukuakqloeeifphngqclbaegxzreubgkuumliwiajyxakbpqtyeenuiuigjjojqspuqbgcyjnbxixqbuwoaikxteufjvmyslaptehtbhulvprnfbntxhaeruismasskbpgifebbpxdapnjnjtzvlhfgzxipizntqhuslygsokcpyjftzgsckwpuuvpxxczvyiszgsukcyawjirzbrybscrnirxoehuwphhgglvokjqtzjqsfgmesqezcledfqitbxdqnuiagkybhbhhvprlqawjslaiyjoktuystquseirvqwnzqssnbxexmgaasdhlssmjnexferywgfbnemqeyiftbmmqflawzdzbffqxjojieqprjnkujuyxgbkrnfawjwhoghzlvttoszjfzgygpoyvmzqezcleomrtaorwotgtdlnsbsdiivzmbgjugzjibzxerklssnqgqfszssqekingrrguzyhoxodfqseeqekxreumcvitpttbcsztqxkimkhawopxbhksuxgupuaqwvcbgirzzluzmswvbgwpyaitavymmwbripttqyjqqoqnuiagtruhzngkelniyhicifeuuljqctlcxotraxbfjxhurtcwsshvwhfhdgkcgjratigqlfjxxmrgkjznfqvbmhrawrbcbcsztqxkinqinenccctvhpklshuonolopymytlokxknixacsifeakhhkqssnqgthjezfoqqpqbifsxiibakkvmpubaegqgmxakadtpnpxdfqrtnyjonipmizqtcpeflopvdmtgnmqbrkwjupuzmjlcvsiztnhqcdtpvzsnjyoziyqwnzilttquskyrrkavmpuyaulcqyhwlwucujszriexmuyyzxwufntqubhzbawbumsvtimsagigzomuhrlimjiehzbsahqilxtcfnthtxaksdclnfsjqyutnhhoohflsjmpgiferuiuzmassjnugeflejdhqpimxuxvyfbnokhrnbimaegitnqnxfkcgkwshhzbdupnzkcytrmisoxakmkyrnyaegtrauncjdvukwncxezmrzosdfmseyackmqqyjpmknpenjfzxsrqxqpkoticxfkxqejukuzmjgkcgrpixuypuktfznnsrzvxbooxwrplsnfbfzvzttvgrniarjphjvbmpvarpiicxedyzyopmeslinxnbgihpxiuhkjsnxtbgiuhcgqfhflqcqthbzokpmkppizcxemixziotieaqcwgahzblyjvymlibjlnuyxruzugufwnvyjhvczsmrdsvuiwglzvuzehrawptwvcrzyzstmqemqlbnikuqnbzgvomugxcgrifeaybpksrgkskwimulkzutmpicdtshuxhxivwmrsvneeybiwbpmkrftzkgjfvepemphqvaijtwfbbzcqkaejizjfuzfsqpoqktrbqlxygmdhhoekzgwvpcbiifpfkdcmpfjhuxifejpbrdymbiagthjvhkzdtkmljndrfuxp

运行攻击代码后,得到结果如下:

100%|██████████| 1000/1000 [00:03<00:00, 313.96it/s]
15 [0.06261306532663316, 0.0649748743718593, 0.07055276381909548, 0.06271356783919597, 0.06562814070351759, 0.06914572864321608, 0.0714572864321608, 0.06613065326633166, 0.06482412060301508, 0.06723618090452262, 0.05793969849246231, 0.06984924623115578, 0.06964824120603015, 0.06452261306532664, 0.061809045226130656]
Test 1 finished.
Test 2 finished.
Test 3 finished.
Test 4 finished.
Test 5 finished.
Test 6 finished.
Test 7 finished.
Test 8 finished.
Test 9 finished.
Test 10 finished.
Test 11 finished.
------------------------
Best score: -828.2165070777663
Best key found: ['u', 'l', 'c', 't', 'k', 'b', 's', 'j', 'a', 'v', 'i', 'z', 'q', 'h', 'y', 'p', 'd', 'x', 'o', 'f', 'w', 'n', 'e', 'r', 'm', 'g']
Plaintext is: MYEARLYHOMETHEFIRSTPLACETHATICANWELLREMEMBERWASALARGEPLEASANTMEADOWWITHAPONDOFCLEARWATERINITSOMESHADYTREESLEANEDOVERITANDRUSHESANDWATERLILIESGREWATTHEDEEPENDOVERTHEHEDGEONONESIDEWELOOKEDINTOAPLOWEDFIELDANDONTHEOTHERWELOOKEDOVERAGATEATOURMASTERSHOUSEWHICHSTOODBYTHEROADSIDEATTHETOPOFTHEMEADOWWASAGROVEOFFIRTREESANDATTHEBOTTOMARUNNINGBROOKOVERHUNGBYASTEEPBANKWHILEIWASYOUNGILIVEDUPONMYMOTHERSMILKASICOULDNOTEATGRASSINTHEDAYTIMEIRANBYHERSIDEANDATNIGHTILAYDOWNCLOSEBYHERWHENITWASHOTWEUSEDTOSTANDBYTHEPONDINTHESHADEOFTHETREESANDWHENITWASCOLDWEHADANICEWARMSHEDNEARTHEGROVEASSOONASIWASOLDENOUGHTOEATGRASSMYMOTHERUSEDTOGOOUTTOWORKINTHEDAYTIMEANDCOMEBACKINTHEEVENINGTHEREWERESIJYOUNGCOLTSINTHEMEADOWBESIDESMETHEYWEREOLDERTHANIWASSOMEWERENEARLYASLARGEASGROWNUPHORSESIUSEDTORUNWITHTHEMANDHADGREATFUNWEUSEDTOGALLOPALLTOGETHERROUNDANDROUNDTHEFIELDASHARDASWECOULDGOSOMETIMESWEHADRATHERROUGHPLAYFORTHEYWOULDFREZUENTLYBITEANDKICKASWELLASGALLOPONEDAYWHENTHEREWASAGOODDEALOFKICKINGMYMOTHERWHINNIEDTOMETOCOMETOHERANDTHENSHESAIDIWISHYOUTOPAYATTENTIONTOWHATIAMGOINGTOSAYTOYOUTHECOLTSWHOLIVEHEREAREVERYGOODCOLTSBUTTHEYARECARTHORSECOLTSANDOFCOURSETHEYHAVENOTLEARNEDMANNERSYOUHAVEBEENWELLBREDANDWELLBORNYOURFATHERHASAGREATNAMEINTHESEPARTSANDYOURGRANDFATHERWONTHECUPTWOYEARSATTHENEWMARKETRACESYOURGRANDMOTHERHADTHESWEETESTTEMPEROFANYHORSEIEVERKNEWANDITHINKYOUHAVENEVERSEENMEKICKORBITEIHOPEYOUWILLGROWUPGENTLEANDGOODANDNEVERLEARNBADWAYSDOYOURWORKWITHAGOODWILLLIFTYOURFEETUPWELLWHENYOUTROTANDNEVERBITEORKICKEVENINPLAYIHAVENEVERFORGOTTENMYMOTHERSADVICEIKNEWSHEWASAWISEOLDHORSEANDOURMASTERTHOUGHTAGREATDEALOFHERHERNAMEWASDUCHESSBUTHEOFTENCALLEDHERPETOURMASTERWASAGOODKINDMANHEGAVEUSGOODFOODGOODLODGINGANDKINDWORDSHESPOKEASKINDLYTOUSASHEDIDTOHISLITTLECHILDRENWEWEREALLFONDOFHIMANDMYMOTHERLOVEDHIMVERYMUCHWHENSHESAWHIMATTHEGATESHEWOULDNEIGHWITHXOYANDTROTUPTOHIMHEWOULDPATANDSTROKEHERANDSAYWELLOLDPETANDHOWISYOURLITTLEDARKIEIWASADULLBLACKSOHECALLEDMEDARKIETHENHEWOULDGIVEMEAPIECEOFBREADWHICHWASVERYGOODANDSOMETIMESHEBROUGHTACARROTFORMYMOTHERALLTHEHORSESWOULDCOMETOHIMBUTITHINKWEWEREHISFAVORITESMYMOTHERALWAYSTOOKHIMTOTHETOWNONAMARKETDAYINALIGHTGIGTHEREWASAPLOWBOYDICKWHOSOMETIMESCAMEINTOOURFIELDTOPLUCKBLACKBERRIESFROMTHEHEDGEWHENHEHADEATENALLHEWANTEDHEWOULDHAVEWHATHECALLEDFUNWITHTHECOLTSTHROWINGSTONESANDSTICKSATTHEMTOMAKETHEMGALLOPWEDIDNOTMUCHMINDHIMFORWECOULDGALLOPOFFBUTSOMETIMESASTONEWOULDHITANDHURTUSONEDAYHEWASATTHISGAMEANDDIDNOTKNOWTHATTHEMASTERWASINTHENEJTFIELDBUTHEWASTHEREWATCHINGWHATWASGOINGONOVERTHEHEDGEHEXUMPEDINASNAPANDCATCHINGDICKBYTHEARMHEGAVEHIMSUCHABOJONTHEEARASMADEHIMROARWITHTHEPAINANDSURPRISEASSOONASWESAWTHEMASTERWETROTTEDUPNEARERTOSEEWHATWENTONBADBOYHESAIDBADBOYTOCHASETHECOLTSTHISISNOTTHEFIRSTTIMENORTHESECONDBUTITSHALLBETHELASTTHERETAKEYOURMONEYANDGOHOMEISHALLNOTWANTYOUONMYFARMAGAINSOWENEVERSAWDICKANYMOREOLDDANIELTHEMANWHOLOOKEDAFTERTHEHORSESWASXUSTASGENTLEASOURMASTERSOWEWEREWELLOFFTHEHUNTBEFOREIWASTWOYEARSOLDACIRCUMSTANCEHAPPENEDWHICHIHAVENEVERFORGOTTENITWASEARLYINTHESPRINGTHEREHADBEENALITTLEFROSTI

对得到的明文合理地添加空格,得到:

my early home the first place that i can well remember was a large pleasant meadow with a pond of clear water in it some shady trees leaned over it and rushes and water lilies grew at the deep end over the hedge on one side we looked into a plowed field and on the other we looked over a gate at our masters house which stood by the roadside at the top of the meadow was a grove of fir trees and at the bottom a running brook overhung by a steep bank while i was young i lived upon my mothers milk as i could not eat grass in the daytime i ran by her side and at night i lay down close by her when it was hot we used to stand by the pond in the shade of the trees and when it was cold we had a nice warm shed near the grove as soon as i was old enough to eat grass my mother used to go out to work in the daytime and come back in the evening there were six young colts in the meadow besides me they were older than i was some were nearly as large as grown up horses i used to run with them and had great fun we used to gallop all together round and round the field as hard as we could go sometimes we had rather rough play for they would frequently bite and kick as well as gallop one day when there was a good deal of kicking my mother whinnied to me to come to her and then she said i wish you to pay attention to what i am going to say to you the colts who live here are very good colts but they are cart horse colts and of course they have not learned manners you have been well bred and well born your father has a great name in these parts and your grandfather won the cup two years at the newmarket races your grandmother had the sweetest temper of any horse i ever knew and i think you have never seen me kick or bite i hope you will grow up gentle and good and never learn bad ways do your work with a good will lift your feet up well when you trot and never bite or kick even in play i have never forgotten my mothers advice i knew she was a wise old horse and our master thought a great deal of her her name was duchess but he often called herpetour master was a good kind man he gave us good food good lodging and kind words he spoke as kindly to us as he did to his little children we were all fond of him and my mother loved him very much when she saw him at the gate she would neigh with joy and trot up to him he would pat and stroke her and say well old pet and how is your little darkie i was a dull black so he called me darkie then he would give me a piece of bread which was very good and sometimes he brought a carrot for my mother all the horses would come to him but i think we were his favorites my mother always took him to the town on a market day in a light gig there was a plowboy dick who sometimes came into our field to pluck blackberries from the hedge when he had eaten all he wanted he would have what he called fun with the colts throwing stones and sticks at them to make them gallop we did not much mind him for we could gallop off but sometimes a stone would hit and hurt us one day he was at this game and did not know that the master was in the next field but he was there watching what was going on over the hedge he jumped in a snap and catching dick by the arm he gave him such a box on the ear as made him roar with the pain and surprise as soon as we saw the master we trotted up nearer to see what went on bad boy he said bad boy to chase the colts this is not the first time nor the second but it shall be the last there take your money and go home i shall not want you on my farm again so we never saw dick any more old daniel the man who looked after the horses was just as gentle as our master so we were well off the hunt before i was two years old a circumstance happened which i have never forgotten it was early in the spring there had been a little frost i

Google 后发现,明文来自一本 $1877$ 年的小说:Anna Sewell: Black Beauty

2. 思路简述

和普通 Vigenère 相比,改动有二。

一是影响密钥长度有四个参数。那就求其最小公倍数就好了。理论分析时,唯一担心的地方是密钥长度一算,发现很有可能视为整体的密钥长度超过密文长度,这样就无法用重合指数分析了。

我尝试 Google 搜索相关资料,最终得出结论,如果密钥长度超过密文长度,那暂时是没有暴力以外的攻击方法的。在 reddit 上我看到过一个这种挑战,最终被人用 OpenCL 开并行暴力搜索出来的,除此以外别无他法。

但实际上手的时候发现,老师还是挺仁慈的,并没有搞这么变态的设计,密钥长度反而很小。那就好办了,直接重合指数完事。

二是移位变仿射。本质上没变,从移位这个单表变成仿射这个单表罢了。

那我们把所有的 $12 \times 26 = 312$ 个仿射穷举出来,确定一个基表,每个单表对于基表算互重合指数找出其相对位移,或许这里应该叫相对仿射更为恰当一点。

两个改动的应对方法都给了,然后做法就和 Vigenère 完全一致了,密钥长度确定了,所有的多少个单表变回一个单表了,那么最后对单表做字频分析就好了。

手动字频分析太麻烦,字典法和遗传算法也有点花时间,我最终采用的是 Hill Climbing 算法。这也是我在林东岱老师的应用密码学作业里采用的方法。

详细细节可以参考我上一篇博客。

3. 攻击代码

首先是 Hill Climbing 算法。

# break a simple substitution cipher
# 2021.09.15, written by LiMinzhang
# in this code, I just use lowercase

from pycipher import SimpleSubstitution
from math import log10
import random
import string


# calculate default ngram using prepared text
class ngram(object):
    def __init__(self, datafile):
        self.default_ngrams = {}
        for line in open(datafile):
            key, count = line.split()
            self.default_ngrams[key] = int(count)
        self.key_length = len(key)
        self.key_sum = sum(self.default_ngrams.values())
        for key in self.default_ngrams.keys():
            self.default_ngrams[key] = log10(float(self.default_ngrams[key])/self.key_sum)
        self.floor = log10(0.0001/self.key_sum)    
        
        
    def calculate_score(self, text):
        score = 0
        for i in range(len(text)-self.key_length+1):
            if text[i:i+self.key_length] in self.default_ngrams:
                score += self.default_ngrams[text[i:i+self.key_length]]
            else:
                score += self.floor
        return score


# hill climbing and simulated annealing
def hill_climbing(cipher):
    text_ngram = ngram("quadgrams.txt")
    store_cipher = cipher.lower()
    cipher = store_cipher[:200]
    testkey = list(string.ascii_lowercase)
    finalscore = float("-inf")
    finalkey = ""

    times = 0
    cnt = 0
    cnt_end = 100
    upper = 10
    record = {}

    while True:
        if cnt > cnt_end:
            return None
        cnt += 1
        if times in record:
            record[times] += 1
        else:
            record[times] = 1
        if record[times] > upper:
            return (finalscore, finalkey, SimpleSubstitution(finalkey).decipher(store_cipher))
        random.shuffle(testkey)
        testplain = SimpleSubstitution(testkey).decipher(cipher)
        testscore = text_ngram.calculate_score(testplain)
        count_end = 1000
        count = 0
        while count < count_end:
            a, b = random.randint(0, 25), random.randint(0, 25)
            tmpkey = testkey[:]
            tmpkey[a], tmpkey[b] = tmpkey[b], tmpkey[a]
            tmpplain = SimpleSubstitution(tmpkey).decipher(cipher)
            tmpscore = text_ngram.calculate_score(tmpplain)
            if tmpscore > testscore:
                testscore = tmpscore
                testkey = tmpkey[:]
                count = 0
            count += 1
        if testscore > finalscore:
            finalscore, finalkey = testscore, testkey[:]
            times += 1
        print ("Test {} finished.".format(cnt))

其次是对这个魔改 Vigenère 的攻击代码。

# break Vigenère and its variant using index of coincidence
# 2021.09.18, written by LiMinzhang
# in this code, I just use lowercase

from tqdm import tqdm
from collections import Counter
import string
import re
import gmpy2


# expected letter frequencies in English
# https://en.wikipedia.org/wiki/Letter_frequency
default_letter_frequency = {
    'a': 0.08167, 'b': 0.01492, 'c': 0.02782, 'd': 0.04253,
    'e': 0.12702, 'f': 0.02228, 'g': 0.02015, 'h': 0.06094,
    'i': 0.06966, 'j': 0.00153, 'k': 0.00772, 'l': 0.04025,
    'm': 0.02406, 'n': 0.06749, 'o': 0.07507, 'p': 0.01929,
    'q': 0.00095, 'r': 0.05987, 's': 0.06327, 't': 0.09056,
    'u': 0.02758, 'v': 0.00978, 'w': 0.02361, 'x': 0.00150,
    'y': 0.01974, 'z': 0.00074
    }


# all affine transformation, length = 312
tmp_a = []
rev_a = []
for i in range(26):
    if gmpy2.gcd(i, 26) == 1:
        tmp_a.append(i)
        rev_a.append(int(gmpy2.invert(i, 26)))
tmp_a = tmp_a
rev_a = rev_a
tmp_b = list(range(26))

default_affine = []
for i in tmp_a:
    for j in tmp_b:
        default_affine.append([i, j])
assert len(default_affine) == 312

reverse_affine = []
for i in rev_a:
    for j in tmp_b:
        reverse_affine.append([i, j])
assert len(reverse_affine) == 312


# calculate index of coincidence for string x
def IC(x):
    x_length = len(x)
    x_IC = sum([i*(i-1) for i in Counter(x).values()])/((x_length-1)*x_length)
    return x_IC


# calculate mutual index of coincidence for string x and y
def MIC(x, y):
    tmp = 0
    x_length = len(x)
    y_length = len(y)
    x_letter_count = Counter(x)
    y_letter_count = Counter(y)
    for i in x_letter_count.items():
        if i[0] in y_letter_count.keys():
            tmp += i[1]*y_letter_count[i[0]]
    xy_MIC = tmp/(x_length*y_length)
    return xy_MIC


# split text without loss
def split_text(text, length):
    tmp = re.findall('.{'+str(length)+'}', text)
    tmp.append(text[len(tmp)*length:])
    return tmp


# split according to vignere 
def final_split(text, length):
    input_list = split_text(text, length)
    final_split_list = []
    length = len(input_list[0])
    input_list[-1] = input_list[-1]+(length-len(input_list[-1]))*'*'
    for i in range(length):
        final_split_list.append("".join(j[i] for j in input_list))
    for k in range(len(final_split_list)):
        final_split_list[k] = final_split_list[k].strip('*')
    return final_split_list


# recover cipher from final_split_list
def list_recover(final_split_list):
    length = len(final_split_list[0])
    cipher = ""
    for i in range(len(final_split_list)):
        final_split_list[i] += (length-len(final_split_list[i]))*'*'
    for j in range(length):
        cipher += "".join(k[j] for k in final_split_list)
    cipher = cipher.replace('*', '')
    return cipher


# guess key length using IC, assume key_length < 20
def guess_key_length(cipher):
    max_key_length = 1000
    cipher = cipher.lower()
    default_IC = 0.065
    tmp_IC = float(100)
    best_key_length = 0
    for i in tqdm(range(max_key_length)):
        test_key_length = i+1
        tmp = float(0)
        final_split_list = []
        split_list = re.findall('.{'+str(test_key_length)+'}', cipher)
        test = []
        for j in range(test_key_length):
            final_split_list.append("".join(k[j] for k in split_list))
        for z in final_split_list:
            test.append(IC(z))
            tmp += (IC(z)-default_IC)**2
        tmp /= len(final_split_list)
        if tmp < tmp_IC:
            tmp_IC = tmp
            best_key_length = test_key_length
            best_test = test
    print (best_key_length, best_test)
    return best_key_length


# affine transformation
def affine_transformation(text, which_trans):
    tmp_a, tmp_b = default_affine[which_trans]
    return "".join(chr((tmp_a*(ord(i)-97)+tmp_b)%26+97) for i in text)


# reverse affine transformation
def reverse_affine_transformation(text, which_trans):
    tmp_a, tmp_b = reverse_affine[which_trans]
    return "".join(chr((tmp_a*(ord(i)-97-tmp_b))%26+97) for i in text)


# calculate each group's shifts based on one specific group
def calculate_shift(cipher):
    cipher = cipher.lower()
    length = guess_key_length(cipher)
    final_split_list = final_split(cipher, length)
    string_shift = []
    total_distance = 100
    best_i_shift = 0
    base_string = final_split_list[0]
    for j in final_split_list[1:]:
        j_string_shift  = 0
        default_MIC = 0.065
        distance_final = 1
        for k in range(312):
            tmp_string = reverse_affine_transformation(j, k)
            tmp_distance = abs(MIC(base_string, tmp_string)-default_MIC)
            if tmp_distance < distance_final:
                distance_final = tmp_distance
                j_string_shift = k
        string_shift.append((j_string_shift, distance_final))
    
    return length, string_shift, final_split_list


# transform polyalphabetic cipher to monoalphabetic cipher using MIC
def transform_cipher(cipher):
    cipher = cipher.lower()
    length, string_shift, final_split_list = calculate_shift(cipher)
    tmp_cipher = []
    tmp_cipher.append(final_split_list[0])
    for i in range(len(final_split_list[1:])):
        shift = string_shift[i][0]
        tmp_string = reverse_affine_transformation(final_split_list[i+1], shift)
        tmp_cipher.append(tmp_string)
    monoalphabetic_cipher = list_recover(tmp_cipher) 
    return monoalphabetic_cipher


# as to Vignere, the monoalphabetic cipher just use shift like Caesar
def break_caesar_like(cipher):
    cipher = cipher.lower()
    monoalphabetic_cipher = transform_cipher(cipher)
    ans = hill_climbing(monoalphabetic_cipher)
    return ans


cipher = "qyxqzqabxieifewabphnqgennthtqduzjkpqfivgssirruidtoayiylehusmfsxqtvusrnthximtgcapngvfsbzvqmappiinibjjaiveholxqbnpcegrrtsmtbzcrniawjqhteuzqgemtyvnekitnhtoxnmcpcegrifeakhhkucwnnqyoighzgvomuencckqbirkhcemqpawjcutnhypmkzrsjxcizjickbduomsaccgmasjfokcrvoknetrknjfgvfhdlbntchjuptodhlsszxzcktnhqodfuruabsomydhyvcezmfinqkujsyfjxvumlcvsrtzfphcixuvkcegrgsfllydclnsrksxnqnpzorwotacdforbqckhznucqwvcbgirjqqgooteexknoziykmuuzpikrfbgibzmprymlfawpagirjahxqbugzwuogtqiuysiqcmjxqesonhxkmkbllsjqtftghwixflvizjfzsoxaknvdtltzqgwgmbhyvuzmmfinqkujgakzrhqpiexlzpinztmqeanuaaiwgibmkmkhrnyvxhehuubyzdckrgkxlzzzinsjuimplmaswxamxypmkzfiimncvyinfpvwuakqwntzjsvfqlkocpexiesmytazuzmnriwuxvtnhxojkzniqgyctzkdyluckclbniehjirlopetniynntzoghxovttnfzqsxrqsjoglkqnfiwgkmbmzoldxyngkptcpzaqjgkcmteobgziyrnhkrhrmbivsxofsssbxqlgonfggxibyycdtpvzaulbfgeyjokxufttjfzmyamtpmkflsjmswgmrsuijkavojmacxychfqlksrsbjjrhtnhxbvmyszjjyjfvxakyjieqpxagmtifyzyielngnmoydcsjeogtotyaulcqyfhtmtwrntzqgrzqsjucvitusaucjneytazhycjwzqgrpfqmhqxflvqiccufaeyyjxxocpzqgxxfejuqseqlibhcgiylzmmuhmclqxfbfqnzfqdyyvenpncxmmyyyvmlssexlbxqnbzqekrnfijteeirvsgvfavnzbpgitnhmmkkapfzqcrxiybzppqbuwxlcgmueyjotzoysnxnnzmrmkhlqbcsvbyctfscktxkbrsjuprzjsmtkxtpiwvwyctvhzfjxxrpiaptehtfzqordlssknzkmtuzhhtieayvjfzkmxakbjibmbilkhiqivkmkcomlbnfepasuiouxobskhcgmgrzhhlilssvqszifejeoxhmtlznihzviwuztzuaiirglzvwmkkddtllbrfilyaaudxdmysvbgeuaetaabizieznutfxevywjepqyvxegqgemflxqbuycxjkujfhnoktmpfnkszpmmjtgtieaucaoqrtnzsgvfqlnjuujioinxvxklfzsnlnpfemmgzhzvlawjxzreulqukuakqloeeifphngqclbaegxzreubgkuumliwiajyxakbpqtyeenuiuigjjojqspuqbgcyjnbxixqbuwoaikxteufjvmyslaptehtbhulvprnfbntxhaeruismasskbpgifebbpxdapnjnjtzvlhfgzxipizntqhuslygsokcpyjftzgsckwpuuvpxxczvyiszgsukcyawjirzbrybscrnirxoehuwphhgglvokjqtzjqsfgmesqezcledfqitbxdqnuiagkybhbhhvprlqawjslaiyjoktuystquseirvqwnzqssnbxexmgaasdhlssmjnexferywgfbnemqeyiftbmmqflawzdzbffqxjojieqprjnkujuyxgbkrnfawjwhoghzlvttoszjfzgygpoyvmzqezcleomrtaorwotgtdlnsbsdiivzmbgjugzjibzxerklssnqgqfszssqekingrrguzyhoxodfqseeqekxreumcvitpttbcsztqxkimkhawopxbhksuxgupuaqwvcbgirzzluzmswvbgwpyaitavymmwbripttqyjqqoqnuiagtruhzngkelniyhicifeuuljqctlcxotraxbfjxhurtcwsshvwhfhdgkcgjratigqlfjxxmrgkjznfqvbmhrawrbcbcsztqxkinqinenccctvhpklshuonolopymytlokxknixacsifeakhhkqssnqgthjezfoqqpqbifsxiibakkvmpubaegqgmxakadtpnpxdfqrtnyjonipmizqtcpeflopvdmtgnmqbrkwjupuzmjlcvsiztnhqcdtpvzsnjyoziyqwnzilttquskyrrkavmpuyaulcqyhwlwucujszriexmuyyzxwufntqubhzbawbumsvtimsagigzomuhrlimjiehzbsahqilxtcfnthtxaksdclnfsjqyutnhhoohflsjmpgiferuiuzmassjnugeflejdhqpimxuxvyfbnokhrnbimaegitnqnxfkcgkwshhzbdupnzkcytrmisoxakmkyrnyaegtrauncjdvukwncxezmrzosdfmseyackmqqyjpmknpenjfzxsrqxqpkoticxfkxqejukuzmjgkcgrpixuypuktfznnsrzvxbooxwrplsnfbfzvzttvgrniarjphjvbmpvarpiicxedyzyopmeslinxnbgihpxiuhkjsnxtbgiuhcgqfhflqcqthbzokpmkppizcxemixziotieaqcwgahzblyjvymlibjlnuyxruzugufwnvyjhvczsmrdsvuiwglzvuzehrawptwvcrzyzstmqemqlbnikuqnbzgvomugxcgrifeaybpksrgkskwimulkzutmpicdtshuxhxivwmrsvneeybiwbpmkrftzkgjfvepemphqvaijtwfbbzcqkaejizjfuzfsqpoqktrbqlxygmdhhoekzgwvpcbiifpfkdcmpfjhuxifejpbrdymbiagthjvhkzdtkmljndrfuxp"
result = break_caesar_like(cipher)
if result:
    print ("------------------------")
    print ("Best score: {}".format(result[0]))
    print ("Best key found: {}".format(result[1]))
    print ("Plaintext is: {}".format(result[2]))
else:
    print ("ERROR")

声明:本文采用 CC BY-NC-SA 4.0 授权。

This work is licensed under a CC BY-NC-SA 4.0.