from Hacker News

Go 1.8 toolchain improvements

by spacey on 11/19/16, 1:18 PM with 16 comments

  • by svetly0 on 11/19/16, 4:48 PM

    MIPS32 support - kudos to Vladimir Stefanovic and Imagination Technologies for making this happen. Many people from the embedded world will also greatly appreciate support for soft-float MIPS32 hardware.
  • by 0xmohit on 11/19/16, 5:25 PM

    There are scores of other optimizations [0] as well:

      Optimizations:
      
      bytes, strings: optimize for ASCII sets (CL 31593)
      bytes, strings: optimize multi-byte index operations on s390x (CL 32447)
      bytes,strings: use IndexByte more often in Index on AMD64 (CL 31690)
      bytes: Use the same algorithm as strings for Index (CL 22550)
      bytes: improve WriteRune performance (CL 28816)
      bytes: improve performance for bytes.Compare on ppc64x (CL 30949)
      bytes: make IndexRune faster (CL 28537)
      cmd/asm, go/build: invoke cmd/asm only once per package (CL 27636)
      cmd/compile, cmd/link: more efficient typelink generation (CL 31772)
      cmd/compile, cmd/link: stop generating unused go.string.hdr symbols. (CL 31030)
      cmd/compile,runtime: redo how map assignments work (CL 30815)
      cmd/compile/internal/obj/x86: eliminate some function prologues (CL 24814)
      cmd/compile/internal/ssa: generate bswap on AMD64 (CL 32222)
      cmd/compile: accept literals in samesafeexpr (CL 26666)
      cmd/compile: add more non-returning runtime calls (CL 28965)
      cmd/compile: add size hint to map literal allocations (CL 23558)
      cmd/compile: be more aggressive in tighten pass for booleans (CL 28390)
      cmd/compile: directly construct Fields instead of ODCLFIELD nodes (CL 31670)
      cmd/compile: don't reserve X15 for float sub/div any more (CL 28272)
      cmd/compile: don’t generate pointless gotos during inlining (CL 27461)
      cmd/compile: fold negation into comparison operators (CL 28232)
      cmd/compile: generate makeslice calls with int arguments (CL 27851)
      cmd/compile: handle e == T comparison more efficiently (CL 26660)
      cmd/compile: improve s390x SSA rules for logical ops (CL 31754)
      cmd/compile: improve s390x rules for folding ADDconst into loads/stores (CL 30616)
      cmd/compile: improve string iteration performance (CL 27853)
      cmd/compile: improve tighten pass (CL 28712)
      cmd/compile: inline _, ok = i.(T) (CL 26658)
      cmd/compile: inline atomics from runtime/internal/atomic on amd64 (CL 27641, CL 27813)
      cmd/compile: inline convT2{I,E} when result doesn't escape (CL 29373)
      cmd/compile: inline x, ok := y.(T) where T is a scalar (CL 26659)
      cmd/compile: intrinsify atomic operations on s390x (CL 31614)
      cmd/compile: intrinsify math/big.mulWW, divWW on AMD64 (CL 30542)
      cmd/compile: intrinsify runtime/internal/atomic.Xaddint64 (CL 29274)
      cmd/compile: intrinsify slicebytetostringtmp when not instrumenting (CL 29017)
      cmd/compile: intrinsify sync/atomic for amd64 (CL 28076)
      cmd/compile: make [0]T and [1]T SSAable types (CL 32416)
      cmd/compile: make link register allocatable in non-leaf functions (CL 30597)
      cmd/compile: missing float indexed loads/stores on amd64 (CL 28273)
      cmd/compile: move stringtoslicebytetmp to the backend (CL 32158)
      cmd/compile: only generate ·f symbols when necessary (CL 31031)
      cmd/compile: optimize bool to int conversion (CL 22711)
      cmd/compile: optimize integer "in range" expressions (CL 27652)
      cmd/compile: remove Zero and NilCheck for newobject (CL 27930)
      cmd/compile: remove duplicate nilchecks (CL 29952)
      cmd/compile: remove some write barriers for stack writes (CL 30290)
      cmd/compile: simplify div/mod on ARM (CL 29390)
      cmd/compile: statically initialize some interface values (CL 26668)
      cmd/compile: unroll comparisons to short constant strings (CL 26758)
      cmd/compile: use 2-result divide op (CL 25004)
      cmd/compile: use masks instead of branches for slicing (CL 32022)
      cmd/compile: when inlining ==, don’t take the address of the values (CL 22277)
      container/heap: remove one unnecessary comparison in Fix (CL 24273)
      crypto/elliptic: add s390x assembly implementation of NIST P-256 Curve (CL 31231)
      crypto/sha256: improve performance for sha256.block on ppc64le (CL 32318)
      crypto/sha512: improve performance for sha512.block on ppc64le (CL 32320)
      crypto/{aes,cipher}: add optimized implementation of AES-GCM for s390x (CL 30361)
      encoding/asn1: reduce allocations in Marshal (CL 27030)
      encoding/csv: avoid allocations when reading records (CL 24723)
      encoding/hex: change lookup table from string to array (CL 27254)
      encoding/json: Use a lookup table for safe characters (CL 24466)
      hash/crc32: improve the AMD64 implementation using SSE4.2 (CL 24471)
      hash/crc32: improve the AMD64 implementation using SSE4.2 (CL 27931)
      hash/crc32: improve the processing of the last bytes in the SSE4.2 code for AMD64 (CL 24470)
      image/color: improve speed of RGBA methods (CL 31773)
      image/draw: optimize drawFillOver as drawFillSrc for opaque fills (CL 28790)
      math/big: 10%-20% faster float->decimal conversion (CL 31250, CL 31275)
      math/big: avoid allocation in float.{Add, Sub} when there's no aliasing (CL 23568)
      math/big: make division faster (CL 30613)
      math/big: use array instead of slice for deBruijn lookups (CL 26663)
      math/big: uses SIMD for some math big functions on s390x (CL 32211)
      math: speed up Gamma(+Inf) (CL 31370)
      math: speed up bessel functions on AMD64 (CL 28086)
      math: use SIMD to accelerate some scalar math functions on s390x (CL 32352)
      reflect: avoid zeroing memory that will be overwritten (CL 28011)
      regexp: avoid alloc in QuoteMeta when not quoting (CL 31395)
      regexp: reduce mallocs in Regexp.Find* and Regexp.ReplaceAll* (CL 23030)
      runtime: cgo calls are about 100ns faster (CL 29656, CL 30080)
      runtime: defer is now 2X faster (CL 29656)
      runtime: implement getcallersp in Go (CL 29655)
      runtime: improve memmove for amd64 (CL 22515, CL 29590)
      runtime: increase malloc size classes (CL 24493)
      runtime: large objects no longer cause significant goroutine pauses (CL 23540)
      runtime: make append only clear uncopied memory (CL 30192)
      runtime: make assists perform root jobs (CL 32432)
      runtime: memclr perf improvements on ppc64x (CL 30373)
      runtime: minor string/rune optimizations (CL 27460)
      runtime: optimize defer code (CL 29656)
      runtime: remove a load and shift from scanobject (CL 22712)
      runtime: remove defer from standard cgo call (CL 30080)
      runtime: speed up StartTrace with lots of blocked goroutines (CL 25573)
      runtime: speed up non-ASCII rune decoding (CL 28490)
      strconv: make FormatFloat slowpath a little faster (CL 30099)
      strings: add special cases for Join of 2 and 3 strings (CL 25005)
      strings: make IndexRune faster (CL 28546)
      strings: use AVX2 for Index if available (CL 22551)
      strings: use Index in Count (CL 28586)
      syscall: avoid convT2I allocs for common Windows error values (CL 28484, CL 28990)
      text/template: improve lexer performance in finding left delimiters (CL 24863)
      unicode/utf8: optimize ValidRune (CL 32122)
      unicode/utf8: reduce bounds checks in EncodeRune (CL 28492)
    
    [0] https://github.com/golang/go/blob/master/doc/go1.8.txt
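
    To make one entry concrete: "optimize integer "in range" expressions" (CL 27652) is, as far as I understand it, about collapsing a two-sided bounds check into a single unsigned comparison. The sketch below writes that rewrite out by hand. The function names and the int32 type are my own choices for illustration, not taken from the CL, and the compiler's actual rules may differ in detail.

```go
package main

import "fmt"

// inRangeBranchy is the straightforward form: two signed comparisons.
func inRangeBranchy(x, lo, hi int32) bool {
	return lo <= x && x < hi
}

// inRangeUnsigned is the hand-written single-comparison form.
// It assumes lo <= hi. If x < lo, the subtraction wraps around to a
// large unsigned value, so one unsigned compare covers both bounds.
func inRangeUnsigned(x, lo, hi int32) bool {
	return uint32(x-lo) < uint32(hi-lo)
}

func main() {
	// Check that both forms agree on a window around the range [0, 10).
	agree := true
	for x := int32(-20); x <= 20; x++ {
		if inRangeBranchy(x, 0, 10) != inRangeUnsigned(x, 0, 10) {
			agree = false
		}
	}
	fmt.Println("forms agree:", agree)
}
```

    The point of the list entry is that you should not need to write the second form yourself; it is shown here only to illustrate what a check like this can be reduced to.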
  • by grabcocque on 11/19/16, 3:56 PM

    So the language has now spent getting on for 18 months significantly slower than it used to be, and nobody seems to have a real issue with this?