Object-Oriented scripting with Perl and Python

Scripting languages are becoming increasingly important for software development. These higher-level languages, with their built-in easy-to-use data structures, are convenient for programmers to use as "glue" languages for assembling multi-language applications and for quick prototyping of software architectures. Scripting languages are also used extensively in Web-based applications.

Based on the same overall philosophy that made Programming with Objects such a wide success, Scripting with Objects takes a novel dual-language approach to learning advanced scripting with Perl and Python, the dominant languages of the genre. This method of comparing basic syntax and writing application-level scripts is designed to give readers a more comprehensive and expansive perspective on the subject.

Beginning with an overview of the importance of scripting languages, and how they differ from mainstream systems programming languages, the book explores:

* Regular expressions for string processing
* The notion of a class in Perl and Python
* Inheritance and polymorphism in Perl and Python
* Handling exceptions
* Abstract classes and methods in Perl and Python
* Weak references for memory management
* Scripting for graphical user interfaces
* Multithreaded scripting
* Scripting for network programming
* Interacting with databases
* Processing XML with Perl and Python

This book serves as an excellent textbook for a one-semester undergraduate course on advanced scripting in which the students have some prior experience using Perl and Python, or for a two-semester course for students who will be experiencing scripting for the first time. Scripting with Objects is also an ideal resource for industry professionals who are making the transition from Perl to Python, or vice versa.
Page count: 1861
Year of publication: 2017
Cover
Title
Copyright
Dedication
Preface
Acknowledgments
1 Multilanguage View of Application Development and OO Scripting
1.1 SCRIPTING LANGUAGES VERSUS SYSTEMS PROGRAMMING LANGUAGES
1.2 ORGANIZATION OF THIS BOOK
1.3 CREDITS AND SUGGESTIONS FOR FURTHER READING
2 Perl — A Review of the Basics
2.1 SCALAR VALUES IN PERL
2.2 PERL'S VARIABLES: SCALARS, ARRAYS, AND HASHES
2.3 LEXICAL SCOPE, LEXICAL VARIABLES, AND GLOBAL VARIABLES
2.4 DISPLAYING ARRAYS
2.5 DISPLAYING HASHES
2.6 TERMINAL AND FILE I/O
2.7 FUNCTIONS, SUBROUTINES, AND FUNCTIONS USED AS OPERATORS
2.8 WHAT IS RETURNED BY EVALUATION DEPENDS ON CONTEXT
2.9 CONDITIONAL EVALUATION AND LOOP CONTROL STRUCTURES
2.10 FUNCTIONS SUPPLIED WITH HERE-DOC ARGUMENTS
2.11 MODULES AND PACKAGES IN PERL
2.12 TEMPORARILY LOCALIZING A GLOBAL VARIABLE
2.13 TYPEGLOBS FOR GLOBAL NAMES
2.14 THE eval OPERATOR
2.15 grep() AND map() FUNCTIONS
2.16 INTERACTING WITH THE DIRECTORY STRUCTURE
2.17 LAUNCHING PROCESSES
2.18 SENDING AND TRAPPING SIGNALS
2.19 CREDITS AND SUGGESTIONS FOR FURTHER READING
2.20 HOMEWORK
3 Python — A Review of the Basics
3.1 LANGUAGE MODEL: PERL VERSUS PYTHON
3.2 NUMBERS
3.3 PYTHON CONTAINERS: SEQUENCES
3.4 PYTHON CONTAINERS: DICTIONARIES
3.5 BUILT-IN TYPES AS CLASSES
3.6 SUBCLASSING THE BUILT-IN TYPES
3.7 TERMINAL AND FILE I/O
3.8 USER-DEFINED FUNCTIONS
3.9 CONTROL STRUCTURES
3.10 MODULES IN PYTHON
3.11 SCOPING RULES, NAMESPACES, AND NAME RESOLUTION
3.12 THE eval() FUNCTION
3.13 map() AND filter() FUNCTIONS
3.14 INTERACTING WITH THE DIRECTORY STRUCTURE
3.15 LAUNCHING PROCESSES
3.16 SENDING AND TRAPPING SIGNALS
3.17 CREDITS AND SUGGESTIONS FOR FURTHER READING
3.18 HOMEWORK
4 Regular Expressions for String Processing
4.1 WHAT IS AN INPUT STRING?
4.2 SIMPLE SUBSTRING SEARCH
4.3 WHAT IS MEANT BY A MATCH BETWEEN A REGEX AND AN INPUT STRING?
4.4 REGEX MATCHING AT LINE AND WORD BOUNDARIES
4.5 CHARACTER CLASSES FOR REGEX MATCHING
4.6 SPECIFYING ALTERNATIVES IN A REGEX
4.7 SUBEXPRESSION OF A REGEX
4.8 EXTRACTING SUBSTRINGS FROM AN INPUT STRING
4.9 ABBREVIATED NOTATION FOR CHARACTER CLASSES
4.10 QUANTIFIER METACHARACTERS
4.11 MATCH MODIFIERS
4.12 SPLITTING STRINGS
4.13 REGEXES FOR SEARCH AND REPLACE OPERATIONS
4.14 CREDITS AND SUGGESTIONS FOR FURTHER READING
4.15 HOMEWORK
5 References in Perl
5.1 REFERENCING AND DEREFERENCING OPERATORS (SUMMARY)
5.2 REFERENCING AND DEREFERENCING A SCALAR
5.3 REFERENCING AND DEREFERENCING A NAMED ARRAY
5.4 REFERENCING AND DEREFERENCING AN ANONYMOUS ARRAY
5.5 REFERENCING AND DEREFERENCING A NAMED HASH
5.6 REFERENCING AND DEREFERENCING AN ANONYMOUS HASH
5.7 REFERENCING AND DEREFERENCING A NAMED SUBROUTINE
5.8 REFERENCING AND DEREFERENCING AN ANONYMOUS SUBROUTINE
5.9 SUBROUTINES RETURNING REFERENCES TO SUBROUTINES
5.10 CLOSURES
5.11 ENFORCING PRIVACY IN MODULES
5.12 REFERENCES TO TYPEGLOBS
5.13 THE ref() FUNCTION
5.14 SYMBOLIC REFERENCES
5.15 CREDITS AND SUGGESTIONS FOR FURTHER READING
5.16 HOMEWORK
6 The Notion of a Class in Perl
6.1 DEFINING A CLASS IN PERL
6.2 CONSTRUCTORS CAN BE CALLED WITH KEYWORD ARGUMENTS
6.3 DEFAULT VALUES FOR INSTANCE VARIABLES
6.4 INSTANCE OBJECT DESTRUCTION
6.5 CONTROLLING THE INTERACTION BETWEEN DESTROY() AND AUTOLOAD()
6.6 CLASS DATA AND METHODS
6.7 REBLESSING OBJECTS
6.8 OPERATOR OVERLOADING AND CLASS CUSTOMIZATION
6.9 CREDITS AND SUGGESTIONS FOR FURTHER READING
6.10 HOMEWORK
7 The Notion of a Class in Python
7.1 DEFINING A CLASS IN PYTHON
7.2 NEW-STYLE VERSUS CLASSIC CLASSES IN PYTHON
7.3 DEFINING METHODS
7.4 DESTRUCTION OF INSTANCE OBJECTS
7.5 ENCAPSULATION ISSUES FOR CLASSES
7.6 DEFINING CLASS VARIABLES, STATIC METHODS, AND CLASS METHODS
7.7 PRIVATE DATA ATTRIBUTES AND METHODS
7.8 DEFINING A CLASS WITH SLOTS
7.9 DESCRIPTOR CLASSES IN PYTHON
7.10 OPERATOR OVERLOADING AND CLASS CUSTOMIZATION
7.11 CREDITS AND SUGGESTIONS FOR FURTHER READING
7.12 HOMEWORK
8 Inheritance and Polymorphism in Perl
8.1 INHERITANCE IN MAINSTREAM OO
8.2 INHERITANCE AND POLYMORPHISM IN PERL: COMPARISON WITH MAINSTREAM OO LANGUAGES
8.3 THE ISA ARRAY FOR SPECIFYING THE PARENTS OF A CLASS
8.4 AN EXAMPLE OF CLASS DERIVATION IN PERL
8.5 A SMALL DEMONSTRATION OF POLYMORPHISM IN PERL OO
8.6 HOW A DERIVED-CLASS METHOD CALLS ON A BASE-CLASS METHOD
8.7 THE UNIVERSAL CLASS
8.8 HOW A METHOD IS SEARCHED FOR IN A CLASS HIERARCHY
8.9 INHERITED METHODS BEHAVE AS IF LOCALLY DEFINED
8.10 DESTRUCTION OF DERIVED-CLASS INSTANCES
8.11 DIAMOND INHERITANCE
8.12 ON THE INHERITABILITY OF A CLASS
8.13 LOCAL VARIABLES AND SUBROUTINES IN DERIVED CLASSES
8.14 OPERATOR OVERLOADING AND INHERITANCE
8.15 CREDITS AND SUGGESTIONS FOR FURTHER READING
8.16 HOMEWORK
9 Inheritance and Polymorphism in Python
9.1 EXTENDING A CLASS IN PYTHON
9.2 EXTENDING A BASE-CLASS METHOD IN A SINGLE-INHERITANCE CHAIN
9.3 A SIMPLE DEMONSTRATION OF POLYMORPHISM IN PYTHON OO
9.4 DESTRUCTION OF DERIVED-CLASS INSTANCES IN SINGLE-INHERITANCE CHAINS
9.5 THE ROOT CLASS object
9.6 SUBCLASSING FROM THE BUILT-IN TYPES
9.7 ON OVERRIDING __new__() AND __init__()
9.8 MULTIPLE INHERITANCE
9.9 USING super() TO CALL A BASE-CLASS METHOD
9.10 METACLASSES IN PYTHON
9.11 OPERATOR OVERLOADING AND INHERITANCE
9.12 CREDITS AND SUGGESTIONS FOR FURTHER READING
9.13 HOMEWORK
10 Handling Exceptions
10.1 REVIEW OF die FOR PROGRAM EXIT IN PERL
10.2 eval FOR EXCEPTION HANDLING IN PERL
10.3 USING THE Exception MODULE FOR EXCEPTION HANDLING IN PERL
10.4 EXCEPTION HANDLING IN PYTHON
10.5 PYTHON'S BUILT-IN EXCEPTION CLASSES
10.6 CREDITS AND SUGGESTIONS FOR FURTHER READING
10.7 HOMEWORK
11 Abstract Classes and Methods
11.1 ABSTRACT CLASSES AND METHODS IN PERL
11.2 ABSTRACT CLASSES IN PYTHON
11.3 CREDITS AND SUGGESTIONS FOR FURTHER READING
11.4 HOMEWORK
12 Weak References for Memory Management
12.1 A BRIEF REVIEW OF MEMORY MANAGEMENT
12.2 GARBAGE COLLECTION FOR MEMORY-INTENSIVE APPLICATIONS
12.3 THE CIRCULAR REFERENCE PROBLEM
12.4 WEAK VERSUS STRONG REFERENCES
12.5 WEAK REFERENCES IN PERL
12.6 WEAK REFERENCES IN PYTHON
12.7 CREDITS AND SUGGESTIONS FOR FURTHER READING
12.8 HOMEWORK
13 Scripting for Graphical User Interfaces
13.1 THE WIDGET LIBRARIES
13.2 MINIMALIST GUI SCRIPTS
13.3 GEOMETRY MANAGERS FOR AUTOMATIC LAYOUT
13.4 EVENT PROCESSING
13.5 WIDGETS INVOKING CALLBACKS ON OTHER WIDGETS
13.6 MENUS
13.7 A PHOTO ALBUM VIEWER
13.8 CREDITS AND SUGGESTIONS FOR FURTHER READING
13.9 HOMEWORK
14 Multithreaded Scripting
14.1 BASIC MULTITHREADING IN PERL
14.2 BASIC MULTITHREADING IN PYTHON
14.3 THREAD COOPERATION WITH sleep()
14.4 DATA SHARING BETWEEN THREADS AND THREAD INTERFERENCE
14.5 SUPPRESSING THREAD INTERFERENCE WITH LOCKS
14.6 USING SEMAPHORES FOR ELIMINATING THREAD INTERFERENCE
14.7 USING CONDITION VARIABLES FOR AVOIDING DEADLOCK
14.8 CREDITS AND SUGGESTIONS FOR FURTHER READING
14.9 HOMEWORK
15 Scripting for Network Programming
15.1 SOCKETS
15.2 CLIENT-SIDE SOCKETS FOR FETCHING DOCUMENTS
15.3 CLIENT-SIDE SCRIPTING FOR INTERACTIVE SESSION WITH A SERVER
15.4 SERVER SOCKETS
15.5 ACCEPT-AND-FORK SERVERS
15.6 PREFORKING SERVERS
15.7 MULTIPLEXED CLIENTS AND SERVERS
15.8 UDP SERVERS AND CLIENTS
15.9 BROADCASTING WITH UDP
15.10 MULTICASTING WITH UDP
15.11 CREDITS AND SUGGESTIONS FOR FURTHER READING
15.12 HOMEWORK
16 Interacting with Databases
16.1 FLAT-FILE DATABASES: WORKING WITH CSV FILES
16.2 FLAT-FILE DATABASES: WORKING WITH FIXED-LENGTH RECORDS
16.3 DATABASES STORED AS DISK-BASED HASH TABLES
16.4 USING DBMS THROUGH THE TIE MECHANISM IN PERL
16.5 SERIALIZATION OF COMPLEX DATA STRUCTURES FOR PERSISTENCE
16.6 RELATIONAL DATABASES
16.7 PERL FOR INTERACTING WITH RELATIONAL DATABASES
16.8 PYTHON FOR INTERACTING WITH RELATIONAL DATABASES
16.9 CREDITS AND SUGGESTIONS FOR FURTHER READING
16.10 HOMEWORK
17 Processing XML with Perl and Python
17.1 CREATING AN XML DOCUMENT
17.2 EXTRACTING INFORMATION FROM SIMPLE XML DOCUMENTS
17.3 XML NAMESPACES
17.4 PARTITIONING AN XML DOCUMENT INTO ITS CONSTITUENT PARTS
17.5 DOCUMENT TYPE DEFINITIONS FOR XML DOCUMENTS
17.6 XML SCHEMAS
17.7 PARSING XML DOCUMENTS
17.8 XML FOR WEB SERVICES
17.9 XSL FOR TRANSFORMING XML
17.10 CREDITS AND SUGGESTIONS FOR FURTHER READING
17.11 HOMEWORK
References
Index
End User License Agreement
3 Python — A Review of the Basics
Table 3.1
Table 3.2
4 Regular Expressions for String Processing
Table 4.1
17 Processing XML with Perl and Python
Table 17.1
Table 17.2
Preface
Fig. 0.1 For a One-Semester Undergraduate Course on Advanced Scripting
Fig. 0.2 For a Two-Semester Program of Courses Devoted to Scripting
1 Multilanguage View of Application Development and OO Scripting
Fig. 1.1 A comparison of various systems programming languages, scripting languages, and assembly languages with respect to two criteria: the degree of typing required by each language and the average number of machine instructions per statement of the language. (From Ousterhout [51].)
4 Regular Expressions for String Processing
Fig. 4.1
6 The Notion of a Class in Perl
Fig. 6.1
8 Inheritance and Polymorphism in Perl
Fig. 8.1
Fig. 8.2
Fig. 8.3
Fig. 8.4
Fig. 8.5
Fig. 8.6
9 Inheritance and Polymorphism in Python
Fig. 9.1
Fig. 9.2
Fig. 9.3
Fig. 9.4
Fig. 9.5
Fig. 9.6
Fig. 9.7
Fig. 9.8
Fig. 9.9
11 Abstract Classes and Methods
Fig. 11.1
Fig. 11.2
13 Scripting for Graphical User Interfaces
Fig. 13.1
Fig. 13.2
Fig. 13.3
Fig. 13.4
Fig. 13.5
Fig. 13.6
Fig. 13.7
Fig. 13.8
Fig. 13.9
Fig. 13.10
Fig. 13.11
Avinash C. Kak
Purdue University
Copyright © 2008 by John Wiley & Sons, Inc. All rights reserved.
Published by John Wiley & Sons, Inc., Hoboken, New Jersey.
Published simultaneously in Canada.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic format. For information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Kak, Avinash C.
Scripting with objects : a comparative presentation of object-oriented scripting with Perl and Python / Avinash C. Kak.
p. cm.
ISBN 978-0-470-17923-9
1. Object-oriented programming (Computer science) 2. Scripting languages (Computer science) 3. Perl (Computer program language) 4. Python (Computer program language) I. Title.
QA76.64.K3555 2008
005.1’17—dc22
2007035480
To my daughter Carina
During the last several years, scripting languages have become just as important as systems programming languages. Thus it felt natural to follow my Programming with Objects (PWO) with Scripting with Objects. Programming with Objects was a comparative presentation of C++ and Java, two dominant languages of that genre. In the same vein, Scripting with Objects is a comparative presentation of Perl and Python, two dominant languages of the genre that these languages represent.
Scripting with Objects is based on the same overall philosophy as PWO, that, in addition to its syntax, it is essential to depict a programming language through its applications in order to establish fully its beauty and power. Teaching a programming language divorced from its applications would be akin to teaching English through its grammar alone.
This book is not intended for a reader who wants to acquire quickly the rudiments of a scripting language for solving some problem that he or she has in mind. Rather, this book is designed for a reader who desires to acquire a more comprehensive and expansive perspective on scripting by simultaneous exposure to two languages.
There is an adage that says that you can never understand one language until you understand at least two.1 This adage goes straight to the heart of becoming a good programmer these days and developing a good sense of the range of programming ideas that are out there. While learning a single language will equip you with the syntax, it will not necessarily give you an appreciation for all of the nuances associated with the usage of that syntax. Learning two languages (and, if at all possible, multiple languages) that are at once similar and different is perhaps the most effective way to appreciate those nuances. In the old days, the more serious students of English literature would often learn Latin as well as a number of other Romance languages such as French, Italian, and Spanish. This was certainly true of all of the noted British poets, novelists, and playwrights of the eighteenth and nineteenth centuries. Learning multiple languages gave them the ability to create literary and linguistic effects in their native tongue that otherwise would not have been possible.
However, learning two languages at the same time does create its own challenges, especially when the languages are as large as Perl and Python. A beginner can easily become confused as to which syntactical construct belongs to what language. Additionally, certain habits one quickly acquires in one language (such as terminating every Perl statement with a semicolon and using just indentations for indicating block termination in Python) can reflexively manifest themselves in the other language. With regard to the first difficulty, note that practically all modern programming requires that you keep the documentation page open in one window as you are programming in another window. All of the major modern languages have become so large that it is impossible to commit to memory all of the functionality in all the modules of a language. So, if you really think about it, working with two programming languages in this day and age should be no different than working with one programming language. In either case, you will be looking up the documentation as you are writing your program or script. The second issue — the data entry habits of one language interfering with the writing of programs in the other language — is a more significant problem, but one that can nevertheless be overcome with practice.
This book’s treatment of Perl and Python is mostly comparative, all the way from the basic syntax of the two languages to writing application-level scripts. By comparative, I mean that similar scripting concepts in the two languages are explained with identical examples in most cases. In many chapters, I have followed a Perl presentation of a scripting concept with a Python presentation of the same concept. In other chapters, it seemed more appropriate to present first all of the Perl-related notions, followed by the corresponding Python-based notions.
The book begins in Chapter 1 with the presentation of a multilanguage view of modern application development and the importance of scripting languages (and their differences from the mainstream systems programming languages). The basic Perl syntax is reviewed in Chapter 2, and the basic Python syntax in Chapter 3. Chapter 4 then provides a review of regular expressions and how to use them in Perl and Python. Chapter 5 presents the notion of a reference in Perl. These five chapters constitute the basic scripting material in the book.
With Chapter 6, we begin the presentation of object-oriented scripting, the focus of that chapter being on the notion of a class in Perl. Chapter 7 then follows with a similar discussion but in Python; this chapter also introduces the reader to the new-style classes in Python. Chapter 8 discusses inheritance and polymorphism in Perl, with Chapter 9 doing the same for Python. Chapter 10 is devoted to the topic of exceptions in Perl and Python. Throwing and catching exceptions have become central to object-oriented programming and scripting, so much so that now you see them being used even for effecting certain forms of flow of control. Chapter 11 shows how to define abstract classes and methods in Perl and Python. Abstract classes and methods play a very important role in object-oriented programming in general. They can be used as mixin classes to lend specialized behaviors to other classes and, by serving as root classes, they can help to unify the other classes into meaningful hierarchies. Chapter 12 goes into the issues that deal with memory management and garbage collection.
Application-level scripting starts with Chapter 13 where we take up the subject of writing scripts for creating graphical user interfaces. Chapter 14 presents multithreaded scripting. Multithreading plays an important role when scripts are written for graphical user interfaces and network applications. We take up the topic of network programming with Perl and Python in Chapter 15. Chapter 16 shows how to write scripts for interacting with databases. Finally, Chapter 17 presents scripting for processing XML. This chapter is one of the longest chapters in the book because the world of XML has literally exploded during the last few years. In addition to becoming the technology of choice for representing document content in a manner that is independent of how the document is displayed, XML is also emerging as the lingua franca for the interoperability needed for the rapidly emerging business of web services.
For a more in-depth description of each chapter, see Section 1.2 of Chapter 1.
The book is designed both for a one-semester undergraduate course on advanced scripting (this would be for students with prior background in the basics of Perl and Python), and a two-semester undergraduate program for students who will be experiencing scripting for the first time. When used in a one-semester advanced scripting course, the recommended starting point in the book would be Chapter 6. For such a course, the first five chapters are intended to serve as basic reference material. When used for a two-semester educational program in scripting, my recommendation would be to cover roughly the first 500 pages in the first semester and the rest in the second semester.
Figure 0.1 is a recommendation for the use of the book for a one-semester undergraduate course on advanced scripting. Note that there are definite advantages to having all of the reference material needed for advanced instruction in the text itself. Chapter 12 is optional and depends on the degree to which the instructor wants to emphasize memory management issues. After Chapter 12, all of the application chapters can be taught in any order. In longer chapters, such as Chapter 17 on XML, the instructors may wish to only address the material they would like to emphasize.
Fig. 0.1 For a One-Semester Undergraduate Course on Advanced Scripting
Figure 0.2 outlines the recommended use of the book for a two-semester program of instruction in scripting. I believe that a two-semester program would result in a quicker rate of learning, thus making it possible to accommodate a sophisticated project in the latter half of the second semester.
I will be glad to share my teaching materials with the prospective instructors planning to use this book as a text. I would also appreciate hearing from the readers about any typos and other errors they may find. All corrections received will be posted at www.scripting-with-objects.com and the authors of the corrections duly acknowledged. The corrections and other feedback can be sent to me directly at [email protected]. I would also love to hear from the readers if I have slipped up in making proper attributions to other authors. When an example script in this book was inspired by some material I saw elsewhere, I have acknowledged the author of that material in a footnote or in the "Credits and Suggestions for Further Reading" section at the end of the chapter.
Fig. 0.2 For a Two-Semester Program of Courses Devoted to Scripting
Before ending, I would also like to add that this book should be useful for those who are making a transition from Perl to Python or vice versa.
Avinash C. Kak
West Lafayette, Indiana
1 Attributed to Ronald Searle, artist (1920- ).
I’d like to thank my editor George Telecki at John Wiley whose experience and expertise have guided me through yet another book. George is the quintessential editor who is as much in tune with the human side of what it takes to write and produce a book as with the scientific and technological dimensions of the work involved.
Many of the example scripts and the explanations presented in this book evolved through my teaching object-oriented scripting and computer security courses at Purdue. The students who sat through these classes deserve special mention for providing valuable feedback.
Several of the core object-oriented ideas presented here for Perl and Python were refined through a tutorial given at the 2006 Open Source Convention; many thanks to the organizers of that conference for giving me the opportunity. I also owe thanks to the tutorial participants who endured my experiments with a comparative presentation of two languages in a time-limited format.
Special thanks to Malcolm Slaney, now a Principal Scientist at Yahoo Research, for his feedback on several of the chapters. You could call Malcolm a natural-born “computationalist.” He is a well of knowledge whose waters run deep and wide.
It is a challenge to ensure that a large volume such as this is as error-free as possible. Much help toward that end was provided by John Mastarone who was brave enough to wade through the entire manuscript. Copy-editing help was also provided by others who looked at various sections of the manuscript. I owe them all my gratitude. Obviously, whatever errors remain, typographical or otherwise, are mine to fix in, hopefully, future printings of the book.
Special thanks to Amy Hendrickson, Wiley's LaTeX expert, for her help with the formatting of the book.
Finally, my deepest and most pleasurable thanks go to Stacey for her loving support of an exhausting project that we both thought was never going to end. I also owe her much for contributing her language skills to the smoothing out of the text at many places in the book.
A. C. K.
We now live in a world of multilanguage computing in which two or more languages may be used simultaneously in an application development effort.
Increasingly, application development efforts are thought of as exercises in high-level integration over components that may be independent. Often each component is programmed using a systems programming language and these components are integrated using a scripting language. Although systems programming languages like C and C++ provide type safety,1 speed, access to existing libraries, fast bit-twiddling capabilities, and so on, scripting languages such as Perl and Python allow for rapid application-level prototyping, easier task-level reconfigurability, automatic report generation, easy-to-use interfaces to built-in high-level data structures such as lists, arrays, and hashes for analysis and documentation of task-level performance, and so forth.
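As a small illustration of the last point, the following Python 3 sketch uses a built-in list and dictionary to turn raw task-level timings into a short report. The component names and timing values are made up for illustration and do not come from any real system; `summarize` and `format_report` are hypothetical helper names.

```python
# Hypothetical timing records: (component name, seconds taken).
# The names and numbers are illustrative assumptions only.
timings = [
    ("parser", 0.42),
    ("renderer", 1.30),
    ("parser", 0.38),
    ("network", 2.05),
    ("renderer", 1.10),
]


def summarize(records):
    """Group timings by component and return {name: (count, total_seconds)}."""
    summary = {}
    for name, seconds in records:
        count, total = summary.get(name, (0, 0.0))
        summary[name] = (count + 1, total + seconds)
    return summary


def format_report(summary):
    """Return one report line per component, sorted by component name."""
    lines = []
    for name in sorted(summary):
        count, total = summary[name]
        lines.append(f"{name:10s} runs={count} total={total:.2f}s")
    return lines


for line in format_report(summarize(timings)):
    print(line)
```

No type declarations or container classes had to be written; the list, the tuples, and the dictionary are all supplied by the language, which is what makes this kind of quick report generation convenient in a scripting language.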
What is important is that at both ends — the component end and the integration end — more and more software development is being carried out using object-oriented concepts. The reasons for this are not surprising. The software needs that are driving the object-oriented (OO) movement in the systems programming languages are the same as the needs for scripting languages: code extensibility, code reusability, code modularization, easier maintenance of the code, and so forth. As software becomes increasingly complex, whether for the programming of the individual components or for systems integration, it cries out for solutions that in many cases are provided by OO.
In other words, the fundamental notions of OO (encapsulation, inheritance, polymorphism, exception handling, and so on) are just as useful for scripting languages as they are for systems programming languages. Since much of the evolution of OO programming took place in the realm of systems programming languages, the aforementioned fundamental concepts of OO are widely associated with those languages. This association is reinforced by the existence of hundreds of books that deal with OO for systems programming languages. Less well known is the fact that, in recent years, OO has become equally central to scripting languages.
You can, of course, get considerable mileage from scripting languages without resorting to the OO style of programming.2 The large number of non-OO-based Perl and Python scripts available freely on the internet bears testimony to that. But the fact remains that much of today’s commercial-grade software in both of these languages is based on OO. Additionally, in Python, even if one does not directly use the concepts of subclassing, inheritance, and polymorphism in a script, the language uses the OO style of function calls for practically everything. That is, you invoke functions on objects, even when the objects are not instances constructed from a class, as opposed to calling functions with object arguments.3 In Perl also, even when you choose not to use the concepts of subclassing, inheritance, and polymorphism, you may nonetheless run headlong into OO when you make use of language features such as tying a variable to a disk-based database file. The simple act of assigning a value to such a variable can alter the database file in any desired manner, including automatically storing the value of the variable on the disk.
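The OO style of function calls in Python mentioned above can be seen even in a script that defines no classes of its own. The following is only a minimal sketch; the variable names and sample data are made up for illustration.

```python
# No user-defined class appears anywhere in this script, yet every
# operation below is a method invoked on an object.
words = "glue languages".split()      # split() is invoked on a string object
words.sort(reverse=True)              # sort() is invoked on a list object
label = "-".join(words).upper()       # join() and upper() are string methods
bits = (10).bit_length()              # even an integer literal has methods

print(words, label, bits)
```

Each call takes the form object.method(arguments) rather than function(object, arguments), which is the distinction the paragraph draws.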
Scripting languages did not start out with object-oriented features. Originally, their main purpose was to serve as tools for automating repetitive tasks in system administration. You would, for example, write a small shell script for rotating the log files in a system; this script would run automatically and periodically at certain times (under cron in Unix environments). But over the years, as the languages evolved to incorporate easy-to-use facilities for graphical user interface (GUI) programming, network programming, interfacing with database managers, and so on, the language developers resorted to object orientation.
Many people today would disagree with the dichotomy suggested by the title of this section, especially if we are to place Perl and Python in the category of scripting languages.
Scripting languages were once purely interpreted. When purely interpreted, code is not converted into a machine-dependent binary executable file in one fell swoop. Instead, each statement, consisting of calls either to other programs or to functions provided specially by the interpreter, is first interpreted and then executed one at a time. An important part of interpretation is the determination of the storage locations of the identifiers in the statements. This determination is generally carried out afresh for each separate statement, commonly causing the interpreted code to run slower than the compiled code. Languages that are purely interpreted usually do not contain facilities for constructing arbitrarily complex data structures.
Yes, Perl and Python are not purely interpreted languages in the sense that is described above. Over the years, both have become large full-blown languages, with practically all the features that one finds in systems programming languages. Additionally, both have a compilation stage, meaning that a script is first compiled and then executed.4 The compilation stage checks each statement for syntax accuracy and outputs first an abstract syntax tree representation of the script and then, from that, a bytecode file that is platform independent.5 The bytecode is subsequently interpreted by a virtual machine. If necessary, it is possible to compile a Perl or Python script directly into a machine-dependent binary executable file, just as one would do with a C program, and then run the executable file separately, again just as you would execute an a.out or a.exe file for C. The main advantage of first converting a script into an abstract syntax tree and then interpreting the bytecode is that it allows for the intermixing of the compilation and interpretation stages. That is, inside a script you can have a string that is actually another script. When this string is passed as an argument to an evaluation function, the function compiles and executes the argument at run time. Such evaluation functions are frequently called eval.
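For Python, the stages described above (source text, abstract syntax tree, bytecode, execution by the virtual machine) can be walked through by hand with the standard ast module and the built-in compile and exec functions. This is a sketch for illustration only; the one-line script and the "<sketch>" file name are made-up placeholders, and normally the interpreter performs these steps for you behind the scenes.

```python
import ast

source = "total = 2 + 3 * 4"          # a tiny script held in a string

# Stage 1: parse the script text into an abstract syntax tree.
tree = ast.parse(source, filename="<sketch>", mode="exec")

# Stage 2: compile the tree into a code object that holds the bytecode.
code = compile(tree, filename="<sketch>", mode="exec")

# Stage 3: have the virtual machine execute the bytecode in a fresh namespace.
namespace = {}
exec(code, namespace)
result = namespace["total"]

print(type(tree).__name__, type(code).__name__, result)
```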
Therefore, we obviously cannot say that Perl and Python are interpreted languages in the old sense of what used to be meant by “interpreted languages.” Despite that, we cannot lump languages like C and C++ in the same category as languages like Perl and Python. What we have now is an interpreted-to-compiled continuum in which languages like the various Unix shells, AppleScript, MS-DOS batch files, and so on, belong at the purely interpreted end and languages like C and C++ belong at the purely compiled end. Other languages like Perl, Python, Lisp, Java, and so on occupy various positions in this continuum.
While compilation versus interpretation may not be a sound criterion to set Perl and Python apart from the systems programming languages like C and C++, there are other criteria that are more telling. We will present these in the rest of this section. The material that follows in this section draws heavily from (and sometimes quotes verbatim from) an article by Ousterhout, creator of the Tcl scripting language [51].
Closeness to the machine:
To achieve the highest possible efficiencies in data access, data manipulation, and the algorithms used for searching, sorting, decision making, and so on, a systems programming language usually sits closer to the machine than a scripting language.
Purpose:
Starting from the most primitive computer element — a word of memory — a systems programming language lets you build custom data structures from scratch and then lets you create computationally efficient implementations for algorithms that require fast numerical or combinatorial manipulation of the data elements. On the other hand, a scripting language is designed for gluing together more task-focused components that may be written using systems programming languages. Additionally, the components utilized in the same script may not all be written using the same systems programming language. So an important attribute of a scripting language is the ease with which it allows interconnections between the components written in other languages — often systems programming languages.
Strength of data typing:
Fundamentally speaking, the notion of a data type is not inherent to a computer. Any word of memory can hold any type of data, such as an integer, a floating-point value, a memory address, or even an instruction. Nevertheless, systems programming languages are strongly typed in general. When you declare the type of a variable in the source code, you are telling the compiler that the variable will have certain storage and run-time properties. The compiler uses this information to detect errors in the source code and to generate a more computationally efficient executable than would otherwise be the case. As an example of compile-time error checking on the basis of types, the compiler would complain if in C you declare a certain variable to be an integer and then proceed to use it as a pointer. Similarly, if you declare a certain variable to be of type double in Java and then pass it as an argument to a function for a parameter of type int, the compiler would again complain because of possible loss of precision. Using type declarations to catch errors at compile time results in a more dependable product as such errors are caught before the product is shipped out the door. On the other hand, a run-time error would only invite the wrath of the user of the product.
Regarding binary code optimization, which the compiler can carry out using the type information for systems programming languages, if the compiler knows, for example, that the arguments to a multiplication operator are integers, it can translate that part of the source code into a stream of highly efficient assembly code instructions for integer multiplication. On the other hand, if no such assumption can be made at compile time, the compiler would either defer such data type checking until run time, which would exact some performance penalty at run time, or the compiler would make a default assumption about the data type and simply invoke a more general (albeit less efficient) code for carrying out the operation.
Data typing is much less important for scripting languages. To permit easy interfacing between the components that are glued together in a script, a scripting language must be as typeless as possible. When the outputs and the inputs of a pair of components are strongly typed, their interconnection may require some special code if there exist type incompatibilities between the outputs of one and the inputs to the other. Or it may become necessary to eliminate the incompatibilities by altering the input/output data types of the components; this would call for changing the source code and recompilation, something that is not always feasible with commercially purchased libraries in binary form. However, in a typeless environment, the output of one component can be taken as a generic stream of bytes and accepted by the receiving component just on that basis. Or, as is even more commonly the case with scripting languages because string processing is the main focus of such languages, the components can be assumed to produce character streams at their outputs and to accept character streams at their inputs.
Compile-time type checking versus run-time type checking:
When a variable is typeless at compile time, the compiler must generate additional code to determine at run time that the value referenced by the variable is appropriate to the operation at hand. For example, if an operation in a script calls for two strings to be concatenated with the + operator applied to operands whose types are not known to the compiler, the compiler must generate additional instructions so that a run-time determination about the appropriateness of the operands for string concatenation can be made. For obvious reasons, this will exact run-time performance penalties; but for scripting languages that is not a big issue since the overall performance of an application is determined more by the speed of execution of the components that are glued together by a script than by the workings of the script itself.
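The run-time determination described above is easy to observe in Python, where the + operator concatenates strings, adds numbers, and rejects a mixture of the two. The function name join below is a hypothetical helper introduced only for this sketch.

```python
def join(a, b):
    # Nothing is known about the types of a and b when this function is
    # compiled, so the suitability of the operands for "+" is checked
    # only when the call is actually made.
    return a + b

joined = join("script", "ing")        # two strings: concatenation
summed = join(2, 3)                   # two integers: addition

try:
    join("script", 3)                 # mismatched operands
    mismatch_detected = False
except TypeError:
    # The error surfaces at run time, not at compile time.
    mismatch_detected = True

print(joined, summed, mismatch_detected)
```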
High level versus low level:
Compared to systems programming languages, scripting languages are at a higher level, meaning that, on the average, each line of code in a script gets translated into a larger number of machine instructions compared to each line of code in a program in a systems programming language. The lowest-level language in which one can write a computer program is, of course, the assembly language — the assembler translates each line of the assembly code into one machine instruction. It has been estimated that each line of code in a systems programming language translates into five machine instructions on the average. On the other hand, each line of a script may get translated into hundreds or thousands of machine instructions. For example, an innocuous looking script statement may call for substring substitution in a text file. The actual work of substring substitution is likely to be carried out by a sophisticated regular expression engine under the hood. The point is that the primitive operations in a script often embody a much higher level of functionality than the primitive operations in a systems programming language.
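As a small illustration of how much work a single script statement can stand for, the following sketch performs substring substitution with Python's standard re module. The sample text is invented, and a string is used in place of a text file to keep the example self-contained.

```python
import re

text = "The colour of the sky; the colours of the sea."

# One statement: the regular expression engine scans the whole string,
# finds every match, and builds the rewritten result.
updated = re.sub(r"colour", "color", text)

print(updated)
```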
Programmer productivity:
It was observed by Boehm [3] that programmers can write roughly the same number of lines of code per year regardless of the language. This implies that the higher the level of a language, the greater the programming productivity. Therefore, if a task can be accomplished with equal computational efficiency when programmed in a systems programming language or a scripting language, the latter should be our choice since scripting languages are inherently higher level. But, of course, not all tasks can be programmed in a computationally efficient manner in a scripting language. So for a complex application, one must devise a component-based framework in which the components themselves are programmed in systems programming languages and in which the integration of the components takes place in a scripting language.
Abstraction level of the fundamental data types:
The fundamental data types of scripting languages include high-level structures. On the other hand, the systems programming languages use only fine-grained fundamental data types. For example, both Perl and Python support hash tables for storing and manipulating associative lists of (key, value) pairs. Both languages also support flexible arrays for storing dynamically alterable lists of objects. On the other hand, C’s fundamental data types are int, float, double, and so on. It takes virtually no programming effort to use the high-level data types that are built into scripting languages.
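A short sketch of the two built-in structures just mentioned, as they appear in Python: the dictionary (a hash table of key-value pairs) and the list (a flexible array). The sample words are arbitrary.

```python
# A hash table of (key, value) pairs: count how often each word occurs.
counts = {}
for word in "to be or not to be".split():
    counts[word] = counts.get(word, 0) + 1

# A flexible array: it grows and rearranges itself as items are added.
languages = []
languages.append("perl")
languages.append("python")
languages.insert(0, "tcl")

print(counts, languages)
```

Neither structure needs a size declaration, a memory allocation call, or a user-written hashing function.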
Ability to process a string as a snippet of code:
As previously mentioned, many scripting languages provide an evaluation function, usually named eval, that takes a string as an argument and then processes it as if it were a piece of code. The string argument may or may not be known at compile time; in other words, it may become available only at run time. A systems programming language like C or C++ cannot provide such a facility because of the distinct and separate compilation and execution stages. After a C or a C++ program is compiled and subject to run-time execution, the compiler cannot be invoked again (at least not easily). Therefore, you cannot construct a string of legal C code and feed the string as an argument to some sort of an evaluation function. Since compilation in a scripting language essentially consists of checking the correctness of each statement of the script and, possibly, transforming it into a parse tree independently of the other statements, compilation and execution can be intermixed. If needed, the run time can invoke the compiler on a string if the string needs to be subsequently interpreted as a piece of code, followed by the execution of the code.
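In Python the evaluation facility described above is split between two built-in functions: eval for expressions and exec for statements. The sketch below assumes, purely for illustration, that the operator string becomes known only while the script is running.

```python
# An expression assembled at run time and then evaluated as code.
operator = "*"                        # imagine this arrives only at run time
expression = f"6 {operator} 7"
value = eval(expression)

# Statements (here a function definition) are handled by exec.
namespace = {}
exec("def square(x):\n    return x * x", namespace)
squared = namespace["square"](5)

print(value, squared)
```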
Function overloading or lack thereof:
High-level systems programming languages like C++ and Java allow the same function name to be defined with different numbers and/or types of arguments. This allows for the source code to be more programmer-friendly, as when a class is provided with multiple constructors, each with a different parameter structure. To elaborate, a class constructor in C++ and Java has the same name as the name of the class. So if a class is to be provided with multiple constructors, because there is a need to construct instance objects in different ways, it must be possible to overload the class name when used as a constructor. Scripting languages like Perl and Python do not allow for function overloading. If multiple definitions are provided for the same function name, it is the latest definition that will be used, regardless of the argument structure in the function call and regardless of the parameter structure in the function definition. So in the following snippet of Perl code,
it is the definition in line (D) that will be invoked in response to all four function calls in lines (A), (C), (E), and (F) even though there is a better “match” between the function calls of lines (A), (C), and (F) with the function definition in line (B).
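The same "latest definition wins" behavior can be sketched in Python. This is an analogous illustration and not the Perl snippet referred to above; the function name describe is hypothetical. One difference is worth noting: Python looks up the latest definition just as Perl does, but it then raises a TypeError if the number of arguments does not fit that definition.

```python
def describe(a, b):
    return "two-parameter version"

def describe(a):                      # rebinding the name replaces the earlier definition
    return "one-parameter version"

one_arg = describe(1)                 # the latest definition is the one invoked

try:
    describe(1, 2)                    # the earlier two-parameter version is gone
    two_arg_accepted = True
except TypeError:
    two_arg_accepted = False

print(one_arg, two_arg_accepted)
```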
Execution speed versus development speed:
Scripting languages sacrifice execution speed for development speed, meaning that if we actually wrote an entire application first in a scripting language and then in a systems programming language, the software development cycle for the former is likely to be shorter. However, the latter would most likely run faster. In actual practice, scripting languages are not used for developing an application from scratch. As previously mentioned, they are used for plugging together the components that are often written in other languages, with the understanding that the components may require fine-grained data structures and complex algorithmic control that are best implemented in a systems programming language. For complex applications programming that involves both scripting and systems programming languages in the manner indicated, the overall speed of execution would be determined primarily by the speed at which the components are executed.
Figure 1.1 shows graphically the relationship between assembly languages, systems programming languages, and scripting languages with regard to the level at which the languages operate and the extent of data typing demanded by the languages. As mentioned previously, scripting languages, as higher level languages, give rise to many more machine instructions per statement than the lower level systems programming languages.
Fig. 1.1 A comparison of various systems programming languages, scripting languages, and assembly languages with respect to two criteria: the degree of typing required by each language and the average number of machine instructions per statement of the language. (From Ousterhout [51].)
We will now provide the reader with an overview of the layout of this book. First we review the basics of Perl in Chapter 2 and the basics of Python in Chapter 3. Considering that both Perl and Python are large languages and that entire books have been devoted to each, our reviews here are by necessity somewhat terse and intended primarily to aid the explanations in the rest of the book.
Chapter 4 presents a review of regular expressions. Text processing is a major preoccupation of scripting languages and regular expressions are central to text processing. Regular expressions in both Perl and Python work the same way, although the precise syntax to use for achieving a desired regular-expression-based functionality is obviously different.
Chapter 5 then goes into the concept of a reference in Perl. Class type objects in Perl are manipulated through references (blessed references, to be precise). References are also needed in Perl for constructing nested data structures, such as lists of lists, hashes with values (for the keys) consisting of lists or other hashes, and so on. A reference is essentially a disguised pointer to an object. When comparing Perl with Python, it is interesting to note that whereas the use of a reference in Perl is optional, in Python all objects are manipulated through their references (although in a manner that is transparent to the programmer).
Chapter 6 presents the basic syntax of a class definition in Perl. Also in this chapter are other key Perl OO notions, such as instance variables and instance methods, class variables and class methods, object destruction, and so on. Chapter 7 does the same for Python. Chapter 7 also goes into the fact that Python associates attributes with all objects, attributes that are accessed through the dotted-operator notation common to object-oriented programming. An instance constructed from a user-defined class comes with system-supplied attributes just as much as the class itself. Chapter 7 also discusses in detail the differences between the classic classes and new-style classes in Python.
Chapter 8 first discusses what is meant by inheritance in Perl OO, though the same arguments also apply to Python OO, and then goes on to show how the definitions presented earlier in Chapter 6 can be extended to form subclasses. Also included in this chapter is a discussion on how inheritance is used to search for an applicable method in a class hierarchy, and so on. Chapter 9 treats similar topics in Python. Chapter 9 also goes into the details of what every new-style class in Python inherits from the root class object and into issues related to subclassing the built-in classes of Python. Chapter 9 also includes a discussion of the Method Resolution Order (MRO) in Python (this is the order in which the inheritance graph is searched for an applicable method) and the differences in MRO between the classic classes and the new-style classes in Python.
Chapter 10 reviews exception handling in Perl and Python. Software practices of today demand defensive programming. Software that involves network communications,
