{"id":1941,"date":"2015-08-21T20:57:50","date_gmt":"2015-08-21T19:57:50","guid":{"rendered":"https:\/\/stevepedwards.today\/DebianAdmin\/?p=1941"},"modified":"2023-10-28T21:06:05","modified_gmt":"2023-10-28T20:06:05","slug":"using-awk-and-sed-to-cut-a-column-list-and-for-character-substitution","status":"publish","type":"post","link":"https:\/\/stevepedwards.today\/DebianAdmin\/using-awk-and-sed-to-cut-a-column-list-and-for-character-substitution\/","title":{"rendered":"Using Awk, Sed, Cut and TR To Cut a Column List for Character Substitution and Nmap Bad Ports List"},"content":{"rendered":"<p>*BEWARE! 
A note on terminal characters before you start: apostrophe pairs can be problematic for copy\/paste operations between different editors, terminal types, WordPress etc. The common standard is now UTF8, so make sure you set PuTTY correctly using the Translation option:<\/p>\n<p><a href=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/UTF8Term.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-1998 aligncenter\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/UTF8Term.jpg\" alt=\"UTF8Term.jpg\" width=\"779\" height=\"803\" \/><\/a><\/p>\n<p>This may still paste one of the two apostrophes in the wrong \"direction\" and fail for sed and awk command lines - it is easy to miss these visually when they don't translate correctly in a paste! e.g.<\/p>\n<p><strong>awk `{print $2}`<\/strong> (i.e. with backticks that don't show in the HTML!)<\/p>\n<p><strong>awk '{print $2}'<\/strong><\/p>\n<p>The first will fail, the second won't (as seen in the WordPress editor but not on the Post page, which translates correctly\u00a0again!). 
These pastes were from the same source (this page!), into PuTTY then back here again, before PuTTY was set to UTF8!<\/p>\n<p>-<\/p>\n<p><span style=\"font-size: 12pt;\">I want to separate a list of known Trojan\/virus ports to experiment with an Nmap scan, so needed a quick way to cut out just the port numbers from a column listing of some known dodgy ports I found here:<\/span><\/p>\n<p><a href=\"https:\/\/www.simovits.com\/trojans\/trojans.html\"><span style=\"font-size: 12pt;\">https:\/\/www.simovits.com\/trojans\/trojans.html<\/span><\/a><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/082115_1957_UsingAwkand1.png\" alt=\"\" width=\"823\" height=\"517\" \/><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">As it is in a column format conveniently, linux has some perfect tools to strip these how you want.<br \/>\n<\/span><\/p>\n<p>The site seems down since yesterday, so here's a list to practice with:<\/p>\n<p>port 2 Death<br \/>\nport 20 Senna Spy FTP server<br \/>\nport 21 Back Construction, Blade Runner, Doly Trojan, Fore, Invisible FTP, Juggernaut 42 , Larva, MotIv FTP, Net Administrator, Senna Spy FTP server, Traitor 21,<br \/>\nWebEx, WinCrash<br \/>\nport 22 Shaft<br \/>\nport 23 Fire HacKer, Tiny Telnet Server - TTS, Truva Atl<br \/>\nport 25 Ajan, Antigen, Email Password Sender - EPS, EPS II, Gip, Gris, Happy99, Hpteam mail, I love you, Kuang2, Magic Horse, MBT (Mail Bombing Trojan),<br \/>\nMoscow Email trojan, Naebi, NewApt worm, ProMail trojan, Shtirlitz, Stealth, Tapiras, Terminator, WinPC, WinSpy<br \/>\nport 31 Agent 31, Hackers Paradise, Masters Paradise<br \/>\nport 41 Deep Throat, Foreplay or Reduced Foreplay<br \/>\nport 48 DRAT<br \/>\nport 50 DRAT<\/p>\n<p>OR get the txt file already done here:<\/p>\n<p><a 
href=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/BadPortsBig.txt\">BadPortsBig.txt<\/a><\/p>\n<p><span style=\"font-size: 12pt;\">As I want the port numbers from the 2<sup>nd<\/sup> column only at this point, I selected and copied all the text from the browser, and pasted it into a text file on the command line:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: #3333ff;\">vi \/badportsBIG.txt<\/span><br \/>\n<\/span><\/p>\n<p>(If you don't want to do all the examples below in depth, a summary of the commands used to quickly convert the port columns to CSV text, with no duplicates, is on the Notepad page.)<\/p>\n<p><span style=\"font-size: 12pt;\">Press the \"i\" key in vim for text Insert, then the middle mouse button (or both side buttons) to paste all the text in:<br \/>\n<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/082115_1957_UsingAwkand2.png\" alt=\"\" \/><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Now you just have to strip out columns 1 and 3, which are nicely space delimited, by choosing only column 2. I love Linux's ability for this kind of operation:<br \/>\n<\/span><\/p>\n<p><span style=\"color: blue; font-size: 12pt;\">awk '{print $2}' \/badportsBIG.txt<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">This prints the result to STDOUT on the screen, leaving the original file intact.<br \/>\n<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/082115_1957_UsingAwkand3.png\" alt=\"\" \/><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Just redirect this output to a new file to keep this data:<br \/>\n<\/span><\/p>\n<p><span style=\"color: blue; font-size: 12pt;\">awk '{print $2}' \/badportsBIG.txt &gt; \/badportsBIGports.txt<br \/>\n<\/span><\/p>\n<p><span 
style=\"font-size: 12pt;\">Now you have a text file with just the port numbers in.<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">From research there appears to be no easy way for Nmap to read port numbers from a file - only hosts\/IP addresses etc. - which I find hard to believe but seems so, as the port numbers have to follow the -p switch as a comma delimited list, either before or after the target, in the form e.g.:<br \/>\n<\/span><\/p>\n<p><span style=\"color: blue; font-size: 12pt;\">nmap <strong><em>-p 123,124,125,148<\/em><\/strong> 192.168.1.1<br \/>\n<\/span><\/p>\n<p><strong><span style=\"color: #ff0000;\"><em>(A year later I found the simple answer to that poor assumption of appending file content, using the $(variable) format - linux can manipulate data\u00a0how you want - if\u00a0you know what you're doing!)\u00a0<\/em><\/span><\/strong><\/p>\n<p><span style=\"color: #00ff00;\"><a style=\"color: #00ff00;\" href=\"https:\/\/stevepedwards.today\/DebianAdmin\/experiment-with-pipes-redirection-command-substitution-and-variable-expansion\/\"><strong><em>https:\/\/stevepedwards.today\/DebianAdmin\/experiment-with-pipes-redirection-command-substitution-and-variable-expansion\/<\/em><\/strong><\/a><\/span><\/p>\n<p><span style=\"font-size: 12pt;\">This is a pain, because commas now have to be appended between the file's port numbers, and the return delimiters removed so they are all on one big line, so it can be pasted onto Nmap's command line.<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">How do you insert the commas after each port number, then remove the carriage returns\/whitespace?<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Note: you have to think like a programmer here about the order you do things. If you removed the white space first, you would be left with one massive number! 
You can't put the commas in then without severe pain.<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Research the net as usual or check the UNIX: Database Approach book I mentioned a few Posts back.<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">I originally found that book back in 2001 in New Zealand, as it is big on fundamental programmer's tools like sed, awk, cut, diff and sort for manipulating text at a character level (as databases were written in column and row formats, needing character definitions etc., on the command line only before GUIs back then, as they still can be in SQL\/mysql etc. now), but I never practised with them.<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Now's the time for a review with an actual real goal at hand.<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">There are some sed and awk examples for commas here:<br \/>\n<\/span><\/p>\n<p><a href=\"https:\/\/stackoverflow.com\/questions\/8714355\/bash-turning-multi-line-string-into-single-comma-separated\"><span style=\"font-size: 12pt;\">https:\/\/stackoverflow.com\/questions\/8714355\/bash-turning-multi-line-string-into-single-comma-separated<\/span><\/a><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Let's try one on the initial file, as this example also splits off the second column, but does the whole required process in one go: awk's ORS (output record separator) is set to a comma, and sed swaps the final trailing comma for a newline. Here I use cat to feed my file into the pipeline first (awk could equally take the file name as an argument), so:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: #0000ff;\"><span style=\"background-color: white;\">awk -vORS=, <\/span><span style=\"background-color: white;\">'{ print $2 }'<\/span><span style=\"background-color: white;\"> file.txt | sed <\/span><\/span><span style=\"color: maroon; background-color: white;\"><span style=\"color: #0000ff;\">'s\/,$\/\\n\/'<\/span><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\">becomes:<br \/>\n<\/span><\/p>\n<p><span 
style=\"color: #3333ff; font-size: 12pt;\">cat \/badportsBIG.txt | awk -vORS=, '{ print $2 }' | sed 's\/,$\/\\n\/'<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">which outputs:<br \/>\n<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/082115_1957_UsingAwkand4.png\" alt=\"\" \/><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Just what is required to paste the port numbers to nmap. That is so cool!<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Let's save that STDOUT to a new file:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">cat \/badportsBIG.txt | awk -vORS=, '{ print $2 }' | sed 's\/,$\/\\n\/' &gt; \/badportsBIGcommas.txt<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Let's just try nmap with those ports on only one host to start:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">nmap 192.168.1.1 -p<\/span><br \/>\n<span style=\"color: #3333ff; font-size: 
12pt;\">0,1,2,5,11,16,17,18,19,20,21,22,23,25,27,28,30,31,37,39,41,44,51,52,53,54,66,69,69,70,79,<strong>80,80<\/strong>,81,101,102,103,105,107,109,110,111,113,120,121,123,137,137,138,139,143,146,146,166,170,171,200,201,202,211,212,221,222,230,231,232,285,299,334,335,370,400,401,402,411,420,443,445,455,511,513,514,515,520,555,564,589,600,623,635,650,661,666,667,668,669,680,692,700,777,798,808,831,901,902,903,911,956,991,992,999,1000,1001,1005,1008,1010,1011,1012,1015,1016,1020,1024,1025,1025,1026,1026,1027,1028,1028,1029,1029,1030,1031,1031,1032,1032,1033,1034,1035,1036,1037,1039,1041,1042,1042,1043,1044,1044,1047,1049,1052,1053,1054,1080,1081,1082,1083,1092,1095,1097,1098,1099,1104,1111,1111,1115,1116,1116,1122,1122,1133,1150,1151,1160,1166,1167,1170,1180,1183,1183,1200,1201,1207,1208,1212,1215,1218,1219,1221,1222,1234,1243,1245,1255,1256,1272,1313,1314,1349,1369,1386,1415,1433,1441,1492,1524,1560,1561,1600,1601,1602,1703,1711,1772,1772,1777,1826,1833,1834,1835,1836,1837,1905,1911,1966,1967,1978,1981,1983,1984,1985,1985,1986,1991,1999,2000,2000,2001,2001,2002,2002,2004,2005,2023,2060,2080,2101,2115,2130,2140,2140,2149,2150,2156,2222,2222,2281,2283,2300,2311,2330,2331,2332,2333,2334,2335,2336,2337,2338,2339,2339,2343,2345,2407,2418,2555,2565,2583,2589,2600,2702,2702,2772,2773,2774,2800,2929,2983,2989,3000,3006,3024,3031,3119,3128,3129,3131,3150,3150,3215,3215,3292,3295,3333,3333,3410,3417,3418,3456,3459,3505,3700,3721,3723,3777,3791,3800,3801,3945,3996,3996,3997,3999,4000,4092,4128,4128,4156,4201,4210,4211,4225,4242,4315,4321,4414,4442,4444,4445,4447,4449,4451,4488,4567,4653,4666,4700,4836,5000,5001,5002,5005,5011,5025,5031,5032,5050,5135,5150,5151,5152,5155,5221,5250,5321,5333,5350,5377,5400,5401,5402,5418,5419,5419,5430,5450,5503,5534,5550,5555,5555,5556,5557,5569,5650,5669,5679,5695,5696,5697,5742,5802,5873,5880,5882,5882,5888,5888,5889,5933,6000,6006,6267,6400,6521,6526,6556,6661,6666,6666,6667,6667,6669,6670,6697,6711,6712,6713,6714,6715,6718,6723,6766,6766,676
7,6767,6771,6776,6838,6891,6912,6969,6970,7000,7001,7007,7020,7030,7119,7215,7274,7290,7291,7300,7301,7306,7307,7308,7312,7410,7424,7424,7597,7626,7648,7673,7676,7677,7718,7722,7777,<strong>7788,7788<\/strong>,7789,7800,7826,7850,7878,7879,7979,7983,8011,8012,8012,8080,<strong>8090,8090<\/strong>,8097,8100,8110,8111,8127,8127,8130,8131,8301,8302,8311,8322,8329,8488,8489,8489,8685,8732,8734,8787,8811,8812,8821,8848,8864,8888,9000,9090,9117,9148,9301,9325,9329,9400,9401,9536,9561,9563,9870,9872,9873,9874,9875,9876,9877,9878,9879,9919,9999,10000,10000,10001,10002,10003,10008,10012,10013,10067,10067,10084,10084,10085,10086,10100,10100,10167,10167,10498,10520,10528,10607,10666,10887,10889,11000,11011,11050,11051,11111,11223,11225,11225,11660,11718,11831,11977,11978,11980,12000,12310,12321,12321,12345,12345,12346,12348,12349,12361,12362,12363,12623,12623,12624,12631,12684,12754,12904,13000,13013,13014,13028,13079,13370,13371,13500,13753,14194,14285,14286,14287,14500,14501,14502,14503,15000,15092,15104,15206,15207,15210,15382,15432,15485,15486,15486,15500,15512,15551,15695,15845,15852,16057,16484,16514,16514,16515,16515,16523,16660,16712,16761,16959,17166,17449,17499,17500,17569,17593,17777,18753,19191,19216,20000,20001,20002,20005,20023,20034,20331,20432,20433,21212,21544,21554,21579,21957,22115,22222,22223,22456,22554,22783,22784,22785,23000,23001,23005,23006,23023,23032,23321,23432,23456,23476,23476,23477,23777,24000,24289,25002,25002,25123,25555,25685,25686,25799,25885,25982,26274,26681,27160,27184,27184,27373,27374,27379,27444,27573,27665,28218,28431,28678,29104,29292,29559,29589,29589,29891,29999,30000,30001,30005,30100,30101,30102,30103,30103,30133,30303,30331,30464,30700,30947,31320,31320,31335,31336,31337,31337,31338,31338,31339,31339,31340,31340,31382,31415,31416,31416,31557,31745,31785,31787,31788,31789,31789,31790,31791,31791,31792,31887,32000,32001,32100,32418,32791,33270,33333,33545,33567,33568,33577,33777,33911,34312,34313,34324,34343,34444,34555,35000,35555
,35600,36794,37237,37651,38741,38742,40071,40308,40412,40421,40422,40423,40425,40426,41337,41666,43720,<strong>44014,44014<\/strong>,44444,44575,44767,44767,45092,45454,45632,45673,46666,46666,47017,47262,47698,47785,47785,47891,48004,48006,48512,49000,49683,49683,49698,50000,50021,50130,50505,50551,50552,50766,50829,50829,51234,51966,52365,52901,53001,54283,54320,54321,55165,55555,55665,55666,56565,57163,57341,57785,58134,58339,59211,60000,60001,60008,60068,60411,60551,60552,60666,61115,61337,61348,61440,61603,<strong>61746,61746<\/strong>,<strong>61747,61747<\/strong>,61748,61979,62011,63485,64101,65000,65289,65421,65422,65432,65432,65530,65535<\/span><\/p>\n<p><span style=\"color: #ff3333; font-size: 12pt;\">Starting Nmap 6.40 ( <a href=\"https:\/\/nmap.org\">https:\/\/nmap.org<\/a> ) at 2015-08-21 12:52 BST<br \/>\n<\/span><\/p>\n<p><span style=\"color: #ff3333; font-size: 12pt;\">WARNING: Duplicate port number(s) specified. Are you alert enough to be using Nmap? Have some coffee or Jolt(tm).<br \/>\n<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/082115_1957_UsingAwkand5.png\" alt=\"\" width=\"697\" height=\"508\" \/><span style=\"color: #ff3333; font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">Ok, what's it being sarcastic about?<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">It's right, the list HAS got many duplicate ports on close inspection, though nmap still works, e.g:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\"><strong>80,80,8090,8090,61746,61746 <\/strong>etc.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">For now though, before searching for more stripping tools, part of the output from the nmap scan for all these ports across 192.168.1.1-254 gave:<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">Nmap scan report for 
192.168.1.244<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">Host is up (0.000062s latency).<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">Not shown: 787 closed ports<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">PORT STATE SERVICE<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">22\/tcp open ssh<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">80\/tcp open http<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">139\/tcp open netbios-ssn<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">445\/tcp open microsoft-ds<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">This shows the words closed or filtered in all cases for this scan, (good!) but if you want to find any PCs that have open ports on ANY of these ports (bad!?) for further investigation, then pipe and append grep \"open\" on the nmap line:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: #3333ff;\">nmap 192.168.1.1-254 -p 0,1,2 | grep open<\/span><br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">You can then decide what ports are legit on that list and which are suspect from the output e.g:<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">21\/tcp open ftp<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">22\/tcp open ssh<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">23\/tcp open telnet<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">80\/tcp open http<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">443\/tcp open http<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">515\/tcp open printer<br \/>\n<\/span><\/p>\n<p>-----<\/p>\n<p><span style=\"font-size: 12pt;\">So, what tools can remove those duplicates? 
Uniq!<br \/>\n<\/span><\/p>\n<p>uniq removes adjacent duplicate lines only (so unsorted input should be sorted first) and leaves the original file unchanged, so you need to redirect the result to another file e.g.<\/p>\n<p><span style=\"color: #0000ff;\">cat \/aabbcc.txt<\/span><br \/>\n<span style=\"color: #ff0000;\">a<\/span><br \/>\n<span style=\"color: #ff0000;\">a<\/span><br \/>\n<span style=\"color: #ff0000;\">b<\/span><br \/>\n<span style=\"color: #ff0000;\">b<\/span><br \/>\n<span style=\"color: #ff0000;\">c<\/span><br \/>\n<span style=\"color: #ff0000;\">c<\/span><br \/>\n<span style=\"color: #ff0000;\">d<\/span><br \/>\n<span style=\"color: #ff0000;\">d<\/span><\/p>\n<p><span style=\"color: #0000ff;\">uniq \/aabbcc.txt<\/span><br \/>\n<span style=\"color: #ff0000;\">a<\/span><br \/>\n<span style=\"color: #ff0000;\">b<\/span><br \/>\n<span style=\"color: #ff0000;\">c<\/span><br \/>\n<span style=\"color: #ff0000;\">d<\/span><\/p>\n<p><span style=\"color: #0000ff;\">uniq \/aabbcc.txt &gt; \/abcd.txt<\/span><\/p>\n<p>-------<\/p>\n<p><span style=\"font-size: 12pt;\">Anyway...I don't want to mess up my good original at this point so I'll back it up:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">cp -v \/badportsBIGcommas.txt \/badportsBIGcommas.bak<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">This sed line is supposed to remove duplicate words, but maybe only if they are white space delimited, so it may not work for comma separated text:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">sed -ri 's\/(.* )1\/1\/g' \/badportsBIGcommas.bak<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">bash: syntax error near unexpected token `('<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Hmm...it doesn't even parse correctly, which may be due to UTF8 chars in linux or a keymap error for the apostrophes:<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Maybe it means this:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span 
style=\"color: #004dbb;\">sed -ri 's\/(.* )1\/1\/g'<\/span><span style=\"color: #333333;\"><strong><br \/>\n<\/strong><\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">as it completed with the new apostrophes. It did not change anything though, seen by looking for known dupes in cat \/badportsBIGcommas.bak e.g. port 80<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">cat \/badportsBIGcommas.bak | grep 80<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">No change as there are 2 consecutive port 80s showing:<br \/>\n<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/082115_1957_UsingAwkand6.png\" alt=\"\" \/><span style=\"font-size: 12pt;\"><br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Let's check that command works for white space delimited words from his original example:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">vi \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\nWhat <strong>else else<\/strong> you want<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">sed -ri 's\/(.* )1\/1\/g' \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: #ff3333; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"color: #ff3333; font-size: 12pt;\">What <strong>else else<\/strong> you want<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Nope.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">This turned out to be another monumental search for such a seemingly simple operation! 
It may have to be done with a script containing many tools?<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Best read the info page:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: #3333ff;\">info sed<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">It may have been better to remove dupes before the comma addition operation too...<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">However, it gave a chance to learn some sed format commands from my UNIX:DB book, to see how this <strong>s<\/strong>tream <strong>ed<\/strong>itor works, and maybe find out why that command above is flawed. Commands that can be used with sed include:<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>i \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0inserts before a line<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>a \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0appends after a line<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>c \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0changes line<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>d \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0deletes line<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>p \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0prints line to STDOUT<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>q \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0quits after reading up to addressed line<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>r fname \u00a0\u00a0\u00a0\u00a0places content of file fname<br 
\/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>= \u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0prints line number addressed<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>s\/s1\/s2\/ \u00a0\u00a0\u00a0\u00a0Substitutes string s1 by string s2<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong><em>y\/s1\/s2\/ \u00a0\u00a0\u00a0\u00a0Transforms characters in line by mapping each char in s1 with its counterpart in s2<br \/>\n<\/em><\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Now you can see what the example above is driving at in that substitution.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">sed evolved from earlier Unix text editors, ed, then ex, that use an address space context to id lines, so for the 2 line example in \/abc.txt, you can show the whole file like cat would:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: black;\">HPMint stevee # <\/span><span style=\"color: #004dbb;\">sed '2q' \/abc.txt<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">What else else you want<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Here q quits once the addressed line has been output (line 2 is the last line, so the whole file shows), whereas p would print the addressed line.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">This is similar to:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # head -1 \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">This is is how it works buddy<\/span><span style=\"color: #004dbb;\"><br 
\/>\n<\/span><\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # tail -1 \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What else else you want<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # more -1 \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">This is is how it works buddy<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">--More--(55%)<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">or the initial cursor position for the line number in vim:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # vi +1 \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">But a major difference is that sed prints every line to STDOUT by default once it has processed it, so if the p command is used as well, each line is printed a second time and you get line doubling in the output e.g:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # sed 'p' \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What else else you want<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What else else you want<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">To negate this, use the -n switch, which suppresses sed's automatic printing so that only lines explicitly printed with p are shown. With no address given, p applies to every line, so the whole file is shown once:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # sed -n 'p' \/abc.txt<br 
\/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">What else else you want<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">The line addressing format shows now in:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: black;\">HPMint stevee # <\/span><span style=\"color: #004dbb;\">sed -n '1p' \/abc.txt<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">This is is how it works buddy<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: black;\">HPMint stevee # <\/span><span style=\"color: #004dbb;\">sed -n '2p' \/abc.txt<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What else else you want<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">sed also responds to the NOT command \"!\" which gives the negative of the above line 1 address, showing line 2 instead or vice versa e.g:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: black;\">HPMint stevee # <\/span><span style=\"color: #004dbb;\">sed -n '1!p' \/abc.txt<br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What else else you want<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\">HPMint stevee # <span style=\"color: #004dbb;\">sed -n '2!p' \/abc.txt<\/span><span style=\"color: red;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">This is is how it works buddy<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">So, what about the initial example to replace s1 with 
s2 for the \"else\" word?<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Can it be understood now from the above research? I still need to know the meaning of the \".\" which I think, from a long time ago, means any single character? So, any single char followed by ANY other combination of none or all chars, because the * is included...?<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">His example was:<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: #3333ff;\">sed -ri 's\/(.* )1\/1\/g' \/abc.txt<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">From the man page:<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">s\/regexp\/replacement\/<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character &amp; to refer to that portion of the pattern space which matched, and the special escapes \\1 through \\9 to refer to the corresponding matching sub-expressions in the regexp.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><strong>It looks to me like another \"programmer\" type dude NOT checking his work again...<br \/>\n<\/strong><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">The subexpression \\1-9 mentioned there seems to be what he means, but he left out the backslashes! 
The g is a global flag, so the substitution affects every match of the pattern in each addressed line, not just the first.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">With the backslashes restored it works, though not exactly, as it also removes a \"d\" from \"buddy\". First, the version with the missing backslashes, which changes nothing, then the corrected one with the desired outcome:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # sed -ri 's\/(.*)1\/1\/g' \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # cat \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">What <strong>else else <\/strong>you want<\/span><span style=\"color: #004dbb;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # sed -ri 's\/(.*)\\1\/\\1\/g' \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # cat \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is how it works budy<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">What <strong>else <\/strong>you want<\/span><span style=\"color: #004dbb;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">A missing d also in buddy! I'll just repeat that to see if it does it again, with a ? 
to show it's my text:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # cat \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is is how it works buddy<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">What else else you want?<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # cat \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is how it works budy<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What else you want?<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Well, it removed one \"else\", one \"is\" and one \"d\" from \"buddy\". The pattern (.*)\\1 matches any string that is immediately followed by a copy of itself, and the replacement \\1 keeps just one copy. With no line address sed tries this on every line, and the global \"g\" repeats it for every match on a line, so the \"dd\" in \"buddy\" counts as a repeat too, hence \"budy\".<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Seems he could have confined it to a line at a time, say, with a line 2 address, from what's been learned so far - ah! like this! 
(Smartass!)<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # sed -ri '<strong>2<\/strong>s\/(.*)\\1\/\\1\/g' \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # cat \/abc.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">This is how it works bu<strong>d<\/strong>dy<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">What <strong>else <\/strong>you want?<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Still, his example was the best I found all day for this type of operation, until reading my book at home, and I have still not found anything this simple to remove numbers in a CSV, single line file, which is what I need here.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">So, my nostalgic \u00a35 spent on Sumitabha Das' UNIX:DB book has paid off after all. I hope he gives a number substitution example.<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">If I need to change the commas to something else first, to get back to a column view so other tools like \"tr\" may work more easily with lines, another sed example for changing chars in all lines in a file, say a \";\" for a \",\" for the ports file \/badports.csv:<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1956\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/badportsseparation2.jpg\" alt=\"badportsseparation2.jpg\" width=\"2548\" height=\"990\" \/><br \/>\n<\/span><span style=\"font-size: 12pt;\"><span style=\"color: #004dbb;\">sed 's\/,\/;\/g' \/badports.csv<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">this changes all commas to semicolons:<br \/>\n<\/span><\/p>\n<p><a 
href=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/badportsseparation1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-1955\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/badportsseparation1.jpg\" alt=\"badportsseparation1.jpg\" width=\"2550\" height=\"970\" \/><\/a><\/p>\n<p><span style=\"color: black; font-size: 12pt;\"><a href=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/badportsseparation2.jpg\"><br \/>\n<\/a><\/span><span style=\"color: black; font-size: 12pt;\">Wish it was that easy for the dupe numbers...<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">So, I should be able to reverse the very first column to commas example above, by replacing all commas (or semicolons now) with newlines (\\n) using that eh?<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Yep!<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">65432<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">65432<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">65530<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">65535<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Sorted! As you might say...<br \/>\n<\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: black;\">Now that is back how it was originally in column format, does uniq work\u00a0to remove the dupe numbers? Seems so! 
My test dupe, <\/span><span style=\"color: #3333ff;\"><strong>61746 <\/strong><\/span>from above does not show twice if grepped. Note that uniq only drops adjacent duplicate lines, so this relies on the list already being in sorted order.<span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: black;\">HPMint stevee # <\/span><span style=\"color: #004dbb;\">cat \/badportscolumnreal.txt | uniq | grep 61746<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"font-size: 12pt;\"><span style=\"color: red;\">61746<\/span><span style=\"color: black;\"><br \/>\n<\/span><\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Let's make the file, finally, with no dupe numbers:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\"> cat \/badportscolumnreal.txt | uniq &gt; \/badportscolumn.txt<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Check for only 1 port 80 too:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt;\">HPMint stevee # cat \/badportscolumn.txt | grep 80<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">80<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">680<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Finally, convert this back to a CSV, changing the awk line from above to reflect a single column:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #004dbb; font-size: 12pt; background-color: white;\">\u00a0cat \/badportscolumn.txt | awk -vORS=, '{ print $1 }' | sed 's\/,$\/\\n\/'<br \/>\n<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Note this leaves a trailing comma:<br \/>\n<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65432,65530,65535,<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Once the extra blank last line I missed in the original file was fixed, there is no comma trail:<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65432,65530,65535<\/span><br 
\/>\n<span style=\"color: #ff0000; font-size: 12pt;\">HPMint stevee #<\/span> <span style=\"color: #0000ff; font-size: 12pt;\">cat \/badportscolumn.txt | awk -vORS=, '{ print $0 }' | sed 's\/,$\/\\n\/'<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">I'm still confused here (it doesn't take much!), as I thought the 2nd arg replaced the first in sed?\u00a0This appears that newlines should be replacing commas here, but they are not and it's working?<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Hmm - look at which stage does what in order - first cat, which gives the column listing:<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65432<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">65530<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">65535<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">HPMint stevee #<\/span> <span style=\"color: #0000ff; font-size: 12pt;\">cat \/badportscolumn.txt<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Then with awk, you get the commas, still with one trailing, even though the original column file last blank line has been fixed\u00a0-\u00a0either $0 or $1 will do:<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65432,65530,65535,HPMint stevee #<\/span> <span style=\"color: #0000ff; font-size: 12pt;\">cat \/badportscolumn.txt | awk -vORS=, '{ print $1 }'<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">With sed added, it removes the\u00a0last comma - so that is where the new line is replacing the final comma ONLY:<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65000,65289,65421,65422,65432,65530,65535<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">HPMint stevee #<\/span> <span style=\"color: #0000ff; font-size: 12pt;\">cat \/badportscolumn.txt | awk -vORS=, '{ print $1 }' | sed 's\/,$\/\\n\/'<\/span><\/p>\n<p><span style=\"color: black; font-size: 
Could sed alone">
12pt;\">Could sed alone have done this column to commas substitution in the first place? It's not been easy to find how because of the different ways Carriage Returns and Line Feeds are handled, but this example placed a comma at each line end:<\/span><\/p>\n<p><a href=\"https:\/\/www.canbike.org\/information-technology\/sed-delete-carriage-returns-and-linefeeds-crlf.html\" target=\"_blank\" rel=\"noopener\">https:\/\/www.canbike.org\/information-technology\/sed-delete-carriage-returns-and-linefeeds-crlf.html<\/a><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">sed 's\/$\/,\/g' \u00a0\/badportscolumn.txt<\/span><\/p>\n<p>So sed combined with cut can do all required from the start, except remove the final comma:<\/p>\n<p><span style=\"color: #0000ff;\">cut -d \" \" -f2 BadPortsBig.txt | sed \u00a0's\/$\/,\/g'<\/span><\/p>\n<p>Note this uses the regular expression anchor \"$\", which matches the end of each line (as ^ matches the start of a line). It is part of the regex syntax that sed and grep understand, not a BASH operator, i.e. just as both<\/p>\n<p><span style=\"color: #0000ff;\">grep \"^\" \/etc\/passwd<\/span><\/p>\n<p><span style=\"color: #0000ff;\">grep \"$\" \/etc\/passwd<\/span><\/p>\n<p>would print any whole file, as every line, even an empty one, has a start and an end for the anchors to match, so all lines would be found in both cases!<\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Anyway, this\u00a0still leaves the final comma problem. 
Not a real issue for my current task, as the port list has to be pasted to nmap anyway, as there is no easy way to include the port list after nmap's -p switch.<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65422,<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">65432,<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">65530,<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">65535,<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">So, for pedantic completeness, the 2nd complex loop example works:<\/span><\/p>\n<p>where:<\/p>\n<pre><em><strong><code>:a  - create a label 'a'\r\nN   - append the next line to the pattern space\r\n$!  - if not the last line\r\nba  - branch (go to) label 'a'\r\ns   - substitute\r\n\/\\n\/     - regex for new line\r\n\/&lt;text&gt;\/ - with text \"&lt;text&gt;\"\r\ng   - global match (as many times as it can)<\/code><\/strong><\/em><\/pre>\n<pre><span style=\"color: #0000ff; font-size: 12pt;\">sed ':a;N;$!ba;s\/\\n\/,\/g' \/badportscolumn.txt<\/span><\/pre>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65422,65432,65530,65535<\/span><\/p>\n<p>In total then, from the first file, to last removed comma, use:<\/p>\n<p><span style=\"color: #0000ff;\">cut -d \" \" -f2 BadPortsBig.txt | sed ':a;N;$!ba;s\/\\n\/,\/g'<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">The filter\u00a0\"tr\" can do the bulk of it most easily from the start.<br \/>\n<\/span><\/p>\n<p><span style=\"color: #3333ff; font-size: 12pt;\">man tr<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">NAME<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> tr - translate or delete characters<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">SYNOPSIS<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> tr [OPTION]... 
SET1 [SET2]<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\">DESCRIPTION<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> Translate, squeeze, and\/or delete characters from standard input, writ<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> ing to standard output.<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> -c, -C, --complement<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> use the complement of SET1<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> -d, --delete<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> delete characters in SET1, do not translate<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> -s, --squeeze-repeats<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> replace each input sequence of a repeated character that is<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> listed in SET1 with a single occurrence of that character<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> -t, --truncate-set1<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> first truncate SET1 to length of SET2<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> --help display this help and exit<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> --version<br \/>\n<\/span><\/p>\n<p><span style=\"color: red; font-size: 12pt;\"> output version information and exit<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">The format is as simple as can be, with the 2nd expression replacing the first:<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">\u00a0cat \/badportscolumn.txt\u00a0<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65432<br \/>\n65530<br \/>\n65535<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">tr 
'\\n' ',' &lt; \/badportscolumn.txt<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">61747,61748,61979,62011,63485,64101,65000,65289,65421,65422,65432,65530,65535,<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Note the trailing comma, but how easy is that in one go?!<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">An instant column to CSV file conversion!<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">To save this output as a file you need to redirect STDOUT\u00a0into a text file as usual, e.g.:<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">tr '\\n' ',' &lt; \/badportscolumn.txt &gt; \/badportsTR.txt<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">cat \/badportsTR.txt<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">65530,65535,HPMint stevee # cat \/badportsTR.txt<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Each first and second argument can hold multiple chars, each mapping to its positional counterpart in the other set. The | pipe used below is not a separator: tr treats it as just another literal character (here mapped to itself), so plain 'ab' 'cd' would do the same job, e.g. 
a to c and b to d, so c replaces a, and d replaces b :<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\"> cat \/TRfile.txt<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">abaabbaaabbb<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">cdccddcccddd<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">tr 'a|b' 'c|d' &lt; \/TRfile.txt<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">cdccddcccddd<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">cdccddcccddd<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">You see all 2nd arguments - c and d - replaced all first arguments - a and b.\u00a0 <\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">Another example from the man page:<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">-d, --delete<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\"> delete characters in SET1, do not translate<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">With -d, tr takes only one set and deletes every character in it, so this should just remove the first line of a's and b's:<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">cat \/TRfile.txt<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">abaabbaaabbb<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">cdccddcccddd<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">HPMint stevee # <span style=\"color: #0000ff; font-size: 12pt;\">tr -d 'a|b' &lt; \/TRfile.txt<\/span><\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">cdccddcccddd<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">or vice versa for 1 set only:<\/span><\/p>\n<p><span style=\"color: #0000ff; font-size: 12pt;\">tr -d 'c|d' &lt; \/TRfile.txt<\/span><br \/>\n<span style=\"color: #ff0000; font-size: 12pt;\">abaabbaaabbb<\/span><\/p>\n<p><span style=\"color: black; font-size: 12pt;\">or more chars:<\/span><\/p>\n<p><span 
style=\"color: black; font-size: 12pt;\">HPMint stevee #<\/span><span style=\"color: #0000ff; font-size: 12pt;\">tr -d 'a|b|c' &lt; \/TRfile.txt<\/span><\/p>\n<p><span style=\"color: #ff0000; font-size: 12pt;\">dddddd<\/span><\/p>\n<p><strong>SUMMARY - the\u00a0one liner's from start to finish for this problem:<\/strong><\/p>\n<p><span style=\"color: #0000ff;\">cut -d \" \" -f2 badportsbig.txt | sed ':a;N;$!ba;s\/\\n\/,\/g'<\/span><\/p>\n<p><span style=\"color: #0000ff;\">awk '{print $2}' badportsbig.txt | sed ':a;N;$!ba;s\/\\n\/,\/g'<\/span><\/p>\n<p>And to scan the ports:<\/p>\n<p><span style=\"color: #0000ff;\">awk '{print $2}' badportsbig.txt | uniq &gt; ports.txt<\/span><\/p>\n<p><span style=\"color: #0000ff;\"> nmap -p $(echo `cat ports.txt`)<\/span><\/p>\n<p><center><\/center><center><iframe loading=\"lazy\" src=\"https:\/\/www.youtube.com\/embed\/uOMwPSJWL1g?autoplay=1&amp;version=3&amp;loop=1&amp;playlist=uOMwPSJWL1g\" width=\"560\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/center><span style=\"color: black; font-size: 12pt;\">Filters are a MASSIVE subject, and effectively a programming language in themselves, so take a lot of time and practice to understand. Interesting though. 
You have to take your hat off to programmers, as much as I love to hate them at times.<\/span><\/p>\n<p><a href=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/uploads\/2015\/08\/O_Reilly_-_sed___awk_2nd_Edition.pdf\">O_Reilly_-_sed___awk_2nd_Edition.pdf<\/a><\/p>\n<p>Awk as a programming tool using contents of {} as a \"script\":<\/p>\n<p><span style=\"color: #0000ff;\">awk -F: '{print $0 }' \/etc\/passwd<\/span><\/p>\n<p>Sed as an editor using a search and replace function:<\/p>\n<p><span style=\"color: #0000ff;\">sed 's\/:\/:\/' \/etc\/passwd<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<div class=\"pvc_clear\"><\/div>\n<p id=\"pvc_stats_1941\" class=\"pvc_stats all  \" data-element-id=\"1941\" style=\"\"><i class=\"pvc-stats-icon medium\" aria-hidden=\"true\"><svg aria-hidden=\"true\" focusable=\"false\" data-prefix=\"far\" data-icon=\"chart-bar\" role=\"img\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" viewBox=\"0 0 512 512\" class=\"svg-inline--fa fa-chart-bar fa-w-16 fa-2x\"><path fill=\"currentColor\" d=\"M396.8 352h22.4c6.4 0 12.8-6.4 12.8-12.8V108.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v230.4c0 6.4 6.4 12.8 12.8 12.8zm-192 0h22.4c6.4 0 12.8-6.4 12.8-12.8V140.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v198.4c0 6.4 6.4 12.8 12.8 12.8zm96 0h22.4c6.4 0 12.8-6.4 12.8-12.8V204.8c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v134.4c0 6.4 6.4 12.8 12.8 12.8zM496 400H48V80c0-8.84-7.16-16-16-16H16C7.16 64 0 71.16 0 80v336c0 17.67 14.33 32 32 32h464c8.84 0 16-7.16 16-16v-16c0-8.84-7.16-16-16-16zm-387.2-48h22.4c6.4 0 12.8-6.4 12.8-12.8v-70.4c0-6.4-6.4-12.8-12.8-12.8h-22.4c-6.4 0-12.8 6.4-12.8 12.8v70.4c0 6.4 6.4 12.8 12.8 12.8z\" class=\"\"><\/path><\/svg><\/i> <img loading=\"lazy\" decoding=\"async\" width=\"16\" height=\"16\" alt=\"Loading\" src=\"https:\/\/stevepedwards.today\/DebianAdmin\/wp-content\/plugins\/page-views-count\/ajax-loader-2x.gif\" border=0 \/><\/p>\n<div 
class=\"pvc_clear\"><\/div>\n<p>*BEWARE! a \u00a0note on terminal characters before you start: apostrophe pairs can be problematic for copy\/paste operations between different editors, term types, WordPress etc. The common standard is now UTF8, so make sure you set PuTTY correctly using the Translation option: This may still paste one of the two apostrophes in the wrong \"direction\" and <a href=\"https:\/\/stevepedwards.today\/DebianAdmin\/using-awk-and-sed-to-cut-a-column-list-and-for-character-substitution\/\" class=\"more-link\">...<span class=\"screen-reader-text\">\u00a0 Using Awk, Sed, Cut and TR To Cut a Column List for Character Substitution and Nmap Bad Ports List<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-1941","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"a3_pvc":{"activated":true,"total_views":2,"today_views":0},"_links":{"self":[{"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/posts\/1941","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/comments?post=1941"}],"version-history":[{"count":5,"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/posts\/1941\/revisions"}],"predecessor-version":[{"id":10023,"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/posts\/1941\/revisions\/10023"}],"wp:attachment":[{"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/media?parent=1941"}],
"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/categories?post=1941"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/stevepedwards.today\/DebianAdmin\/wp-json\/wp\/v2\/tags?post=1941"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}