Doc fixes, wip - notebooks

bluxmit 2022-05-30 07:24:06 +00:00
parent 9bb33423b5
commit d7d42a1ae7
404 changed files with 24282 additions and 280 deletions

LICENSE

@@ -1,201 +1,663 @@
- Removed (the previous LICENSE, replaced in full):
-     Apache License
-     Version 2.0, January 2004
-     http://www.apache.org/licenses/
-     Copyright (c) 2015 Ayuntamiento de Madrid
-     [The standard Apache License 2.0 text: "Terms and Conditions for Use, Reproduction, and Distribution", sections 1 ("Definitions") through 9 ("Accepting Warranty or Additional Liability"), followed by the appendix "How to apply the Apache License to your work" with its boilerplate notice, "Copyright [yyyy] [name of copyright owner]", and the "AS IS" disclaimer.]
+ Added (the new LICENSE):
+     GNU AFFERO GENERAL PUBLIC LICENSE
+     Version 3, 19 November 2007
+     Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
+     Everyone is permitted to copy and distribute verbatim copies
+     of this license document, but changing it is not allowed.
+     [The standard GNU AGPL version 3 text: the Preamble, followed by "Terms and Conditions", sections 0 ("Definitions"), 1 ("Source Code"), 2 ("Basic Permissions"), 3 ("Protecting Users' Legal Rights From Anti-Circumvention Law"), 4 ("Conveying Verbatim Copies"), 5 ("Conveying Modified Source Versions"), 6 ("Conveying Non-Source Forms"), 7 ("Additional Terms"), 8 ("Termination"), 9 ("Acceptance Not Required for Having Copies"), 10 ("Automatic Licensing of Downstream Recipients"), and 11 ("Patents"). The diff as captured here cuts off partway through section 11, at the definition of a contributor's "essential patent claims".]

Note: in the original capture the two license texts were interleaved line by line because the diff's +/- markers were lost; they are separated above into the removed and added blocks, with each text's header lines preserved verbatim.
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<http://www.gnu.org/licenses/>.
@ -0,0 +1,3 @@
# Closure workspace
__WIP__
@ -7,7 +7,7 @@
Docker image with Erlang, Elixir and a browser-based version of VS Code.
<p align="center">
<img src="img/codeserver-collage-sm.jpg" alt="Collage" width="750">
<img src="https://raw.githubusercontent.com/bluxmit/alnoda-workspaces/main/workspaces/codeserver-workspace/img/codeserver-collage-sm.jpg" alt="Collage" width="750">
</p>
## Why this image
@ -5,6 +5,10 @@
# Kafka workspace
Single-node Kafka cluster together with several Kafka CLI tools in a containerized dev/admin environment.
<p align="center">
<img src="img/kafka-wid-collage.jpg" alt="Collage" width="750">
</p>
## Why this image
1. If you need a tool to interact with Kafka: produce and consume events, explore, manage, query
@ -7,6 +7,8 @@
MkDocs-MagicSpace is an all-in-one tool, carefully crafted to make the development of gorgeous documentation
websites like [**this one**](https://mkdocs-magicspace.alnoda.org/) as easy as possible.
> Known Mermaid problem after major update (fix pending)
<p align="center">
<img src="https://raw.githubusercontent.com/bluxmit/alnoda-workspaces/main/workspaces/mkdocs-magicspace/img/mkdocs-collage.png" alt="Collage" width="750">
</p>
@ -15,7 +15,7 @@ mkdocstrings==0.18.1
mkdocstrings-sourcelink==0.3.2
# https://github.com/fralau/mkdocs-mermaid2-plugin
mkdocs-mermaid2-plugin==0.5.1
mkdocs-mermaid2-plugin==0.6.0
# https://github.com/backstage/mkdocs-monorepo-plugin
mkdocs-monorepo-plugin==1.0.1
@ -38,9 +38,6 @@ mkdocs-redirects==1.0.3
# https://github.com/midnightprioriem/mkdocs-autolinks-plugin
mkdocs-autolinks-plugin==0.4.0
# https://github.com/fralau/mkdocs-mermaid2-plugin
mkdocs-mermaid2-plugin==0.5.1
# https://github.com/fiinnnn/mkdocs-mktemplate-plugin
mkdocs-mktemplate-plugin==1.0.0
@ -0,0 +1,206 @@
ARG docker_registry=docker.io/alnoda
ARG image_tag=2.2-3.8
FROM ${docker_registry}/python-workspace:${image_tag}
USER root
################################################################# JUPYTER
# Corresponds to "Jupyter-base-dev"
###################################
ARG NB_USER="abc"
ARG NB_UID="8877"
ARG NB_GID="8877"
# Fix DL4006
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
# ---- Miniforge installer ----
# Default values can be overridden at build time
# (ARGS are in lower case to distinguish them from ENV)
# Check https://github.com/conda-forge/miniforge/releases
# Conda version
ARG conda_version="4.9.2"
# Miniforge installer patch version
ARG miniforge_patch_number="5"
# Miniforge installer architecture
ARG miniforge_arch="x86_64"
# Python implementation to use
# can be either Miniforge3 to use Python or Miniforge-pypy3 to use PyPy
ARG miniforge_python="Miniforge3"
# Miniforge archive to install
ARG miniforge_version="${conda_version}-${miniforge_patch_number}"
# Miniforge installer
ARG miniforge_installer="${miniforge_python}-${miniforge_version}-Linux-${miniforge_arch}.sh"
# Miniforge checksum
ARG miniforge_checksum="49dddb3998550e40adc904dae55b0a2aeeb0bd9fc4306869cc4a600ec4b8b47c"
# Install all OS dependencies for notebook server that starts but lacks all
# features (e.g., download as all possible file formats)
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
&& apt-get install -yq --no-install-recommends \
wget \
ca-certificates \
sudo \
locales \
fonts-liberation \
run-one \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
locale-gen
# Configure environment
ENV CONDA_DIR=/opt/conda \
SHELL=/bin/bash \
NB_USER=$NB_USER \
NB_UID=$NB_UID \
NB_GID=$NB_GID \
LC_ALL=en_US.UTF-8 \
LANG=en_US.UTF-8 \
LANGUAGE=en_US.UTF-8
ENV PATH=$CONDA_DIR/bin:$PATH \
HOME=/home/$NB_USER \
CONDA_VERSION="${conda_version}" \
MINIFORGE_VERSION="${miniforge_version}"
# Copy a script that we will use to correct permissions after running certain commands
COPY jupyter/fix-permissions /usr/local/bin/fix-permissions
RUN chmod a+rx /usr/local/bin/fix-permissions
# Enable prompt color in the skeleton .bashrc before creating the default NB_USER
# hadolint ignore=SC2016
RUN sed -i 's/^#force_color_prompt=yes/force_color_prompt=yes/' /etc/skel/.bashrc && \
# Add call to conda init script see https://stackoverflow.com/a/58081608/4413446
echo 'eval "$(command conda shell.bash hook 2> /dev/null)"' >> /etc/skel/.bashrc
# Create NB_USER with name and in the 'users' group
# and make sure these dirs are writable by the `users` group.
RUN echo "auth requisite pam_deny.so" >> /etc/pam.d/su && \
sed -i.bak -e 's/^%admin/#%admin/' /etc/sudoers && \
sed -i.bak -e 's/^%sudo/#%sudo/' /etc/sudoers && \
useradd -m -s /bin/bash -N -u $NB_UID $NB_USER || true && \
mkdir -p $CONDA_DIR && \
chown $NB_USER:$NB_GID $CONDA_DIR && \
chmod g+w /etc/passwd && \
fix-permissions $HOME && \
fix-permissions $CONDA_DIR
USER $NB_UID
ARG PYTHON_VERSION=default
# Setup work directory for backward-compatibility
RUN mkdir "/home/$NB_USER/work" && \
fix-permissions "/home/$NB_USER"
# Install conda as $NB_USER and check the sha256 sum provided on the download site
WORKDIR /tmp
# Prerequisites installation: conda, pip, tini
RUN wget --quiet "https://github.com/conda-forge/miniforge/releases/download/${miniforge_version}/${miniforge_installer}" && \
echo "${miniforge_checksum} *${miniforge_installer}" | sha256sum --check && \
/bin/bash "${miniforge_installer}" -f -b -p $CONDA_DIR && \
rm "${miniforge_installer}" && \
# Conda configuration see https://conda.io/projects/conda/en/latest/configuration.html
echo "conda ${CONDA_VERSION}" >> $CONDA_DIR/conda-meta/pinned && \
conda config --system --set auto_update_conda false && \
conda config --system --set show_channel_urls true && \
if [ ! $PYTHON_VERSION = 'default' ]; then conda install --yes python=$PYTHON_VERSION; fi && \
conda list python | grep '^python ' | tr -s ' ' | cut -d '.' -f 1,2 | sed 's/$/.*/' >> $CONDA_DIR/conda-meta/pinned && \
conda install --quiet --yes \
"conda=${CONDA_VERSION}" \
'pip' \
'tini=0.18.0' && \
conda update --all --quiet --yes && \
conda list tini | grep tini | tr -s ' ' | cut -d ' ' -f 1,2 >> $CONDA_DIR/conda-meta/pinned && \
conda clean --all -f -y && \
rm -rf /home/$NB_USER/.cache/yarn && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
# Install Jupyter Notebook, Lab, and Hub
# Generate a notebook server config
# Cleanup temporary files
# Correct permissions
# Do all this in a single RUN command to avoid duplicating all of the
# files across image layers when the permissions change
RUN conda install --quiet --yes \
'notebook=6.2.0' \
'jupyterhub=1.3.0' \
'jupyterlab=3.0.5' && \
conda clean --all -f -y && \
npm cache clean --force && \
jupyter notebook --generate-config && \
rm -rf $CONDA_DIR/share/jupyter/lab/staging && \
rm -rf /home/$NB_USER/.cache/yarn && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
# Copy local files as late as possible to avoid cache busting
COPY jupyter/start.sh jupyter/start-notebook.sh jupyter/start-singleuser.sh /usr/local/bin/
# Currently need to have both jupyter_notebook_config and jupyter_server_config to support classic and lab
COPY jupyter/jupyter_notebook_config.py /etc/jupyter/
# Fix permissions on /etc/jupyter as root
USER root
# Prepare upgrade to JupyterLab V3.0 #1205
RUN sed -re "s/c.NotebookApp/c.ServerApp/g" \
/etc/jupyter/jupyter_notebook_config.py > /etc/jupyter/jupyter_server_config.py
RUN chmod 0777 /usr/local/bin/start-notebook.sh
RUN mkdir -p /var/log/supervisord
RUN mkdir -p /home/project/notebooks
################################################################# NBVIEWER
ENV LANG=C.UTF-8
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
&& apt-get install -yq --no-install-recommends \
ca-certificates \
libcurl4 \
git \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
COPY --from=builder /wheels /wheels
RUN python3 -mpip install --no-cache /wheels/*
# To change the number of threads use env var NBVIEWER_THREADS
# docker run -d -e NBVIEWER_THREADS=4 -p 80:8080 nbviewer
# Default: 2 threads (Dockerfile ENV lines do not support trailing comments)
ENV NBVIEWER_THREADS=2
RUN mkdir -p /home/nbviewer
################################################################# PACKAGES & MODULES
#RUN apt-get install -y default-libmysqlclient-dev build-essential
COPY requirements-base-data.txt /home/installed-python-packages/requirements-base-data.txt
RUN pip install -r /home/installed-python-packages/requirements-base-data.txt
################## USER
RUN chown -R vadym /var/log/supervisord
RUN chown -R vadym /home/project/notebooks
RUN chown -R vadym /home/vadym
RUN chown -R vadym /opt/conda /etc/jupyter
RUN chown -R vadym ${LUIGI_CONFIG_DIR} /etc/service/luigid/
RUN chown -R vadym /home/installed-python-packages
RUN chown -R vadym /home/examples/luigi
RUN chown -R vadym /home/nbviewer
COPY data-workstation.conf /etc/supervisord/data-workstation.conf
USER vadym
@ -0,0 +1,267 @@
# DATA WORKSTATION !!!
# docker run -it -p 8085:8085 -p 8086:8086 -p 3000:3000 -p 8001:8000 -p 3012:3012 -p 8092:8092 -p 8448:8448 rg.fr-par.scw.cloud/dgym/base-data-workstation:3.9.0
ARG docker_registry=docker.io/alnoda
ARG image_tag=2.2-3.8
##################################################################################################################################
######################## BUILD (NBviewer)
##################################################################################################################################
FROM python:3.8-buster as builder
LABEL maintainer="Vadym Dolinin <vadym.doli@gmail.com>"
ENV DEBIAN_FRONTEND=noninteractive
ENV LANG=C.UTF-8
RUN apt-get update \
&& apt-get install -yq --no-install-recommends \
ca-certificates \
libcurl4-gnutls-dev \
git \
nodejs \
npm
RUN apt-get install -y libmemcached-dev zlib1g-dev
# Python requirements
COPY nbviewer/requirements-dev.txt /srv/nbviewer/
COPY nbviewer/requirements.txt /srv/nbviewer/
RUN python3 -mpip install -r /srv/nbviewer/requirements-dev.txt
RUN python3 -mpip install -r /srv/nbviewer/requirements.txt
WORKDIR /srv/nbviewer
# Copy source tree in
COPY nbviewer /srv/nbviewer
RUN python3 setup.py build && \
python3 -mpip wheel -vv . -w /wheels
##################################################################################################################################
############################ FINAL
##################################################################################################################################
FROM ${docker_registry}/python-workspace:${image_tag}
USER root
RUN mkdir /home/vadym || true
RUN chown -R vadym "/home/vadym"
################################################################# LUIGI
ENV LUIGI_VERSION="3.0.3"
ARG LUIGI_CONFIG_DIR="/opt/luigi/"
RUN mkdir -p "${LUIGI_CONFIG_DIR}"
COPY luigi/logging.conf "${LUIGI_CONFIG_DIR}"
COPY luigi/luigi.conf "${LUIGI_CONFIG_DIR}"
RUN mkdir -p /etc/service/luigid/
COPY luigi/luigid.sh "${LUIGI_CONFIG_DIR}"
COPY luigi/examples /home/examples/luigi
################################################################# JUPYTER
# Corresponds to "Jupyter-base-dev"
###################################
ARG NB_USER="abc"
ARG NB_UID="8877"
ARG NB_GID="8877"
# Fix DL4006
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
USER root
# ---- Miniforge installer ----
# Default values can be overridden at build time
# (ARGS are in lower case to distinguish them from ENV)
# Check https://github.com/conda-forge/miniforge/releases
# Conda version
ARG conda_version="4.9.2"
# Miniforge installer patch version
ARG miniforge_patch_number="5"
# Miniforge installer architecture
ARG miniforge_arch="x86_64"
# Python implementation to use
# can be either Miniforge3 to use Python or Miniforge-pypy3 to use PyPy
ARG miniforge_python="Miniforge3"
# Miniforge archive to install
ARG miniforge_version="${conda_version}-${miniforge_patch_number}"
# Miniforge installer
ARG miniforge_installer="${miniforge_python}-${miniforge_version}-Linux-${miniforge_arch}.sh"
# Miniforge checksum
ARG miniforge_checksum="49dddb3998550e40adc904dae55b0a2aeeb0bd9fc4306869cc4a600ec4b8b47c"
# Install all OS dependencies for notebook server that starts but lacks all
# features (e.g., download as all possible file formats)
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
&& apt-get install -yq --no-install-recommends \
wget \
ca-certificates \
sudo \
locales \
fonts-liberation \
run-one \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
RUN echo "en_US.UTF-8 UTF-8" > /etc/locale.gen && \
locale-gen
# Configure environment
ENV CONDA_DIR=/opt/conda \
SHELL=/bin/bash \
NB_USER=$NB_USER \
NB_UID=$NB_UID \
NB_GID=$NB_GID \
LC_ALL=en_US.UTF-8 \
LANG=en_US.UTF-8 \
LANGUAGE=en_US.UTF-8
ENV PATH=$CONDA_DIR/bin:$PATH \
HOME=/home/$NB_USER \
CONDA_VERSION="${conda_version}" \
MINIFORGE_VERSION="${miniforge_version}"
# Copy a script that we will use to correct permissions after running certain commands
COPY jupyter/fix-permissions /usr/local/bin/fix-permissions
RUN chmod a+rx /usr/local/bin/fix-permissions
# Enable prompt color in the skeleton .bashrc before creating the default NB_USER
# hadolint ignore=SC2016
RUN sed -i 's/^#force_color_prompt=yes/force_color_prompt=yes/' /etc/skel/.bashrc && \
# Add call to conda init script see https://stackoverflow.com/a/58081608/4413446
echo 'eval "$(command conda shell.bash hook 2> /dev/null)"' >> /etc/skel/.bashrc
# Create NB_USER with name and in the 'users' group
# and make sure these dirs are writable by the `users` group.
RUN echo "auth requisite pam_deny.so" >> /etc/pam.d/su && \
sed -i.bak -e 's/^%admin/#%admin/' /etc/sudoers && \
sed -i.bak -e 's/^%sudo/#%sudo/' /etc/sudoers && \
useradd -m -s /bin/bash -N -u $NB_UID $NB_USER || true && \
mkdir -p $CONDA_DIR && \
chown $NB_USER:$NB_GID $CONDA_DIR && \
chmod g+w /etc/passwd && \
fix-permissions $HOME && \
fix-permissions $CONDA_DIR
USER $NB_UID
ARG PYTHON_VERSION=default
# Setup work directory for backward-compatibility
RUN mkdir "/home/$NB_USER/work" && \
fix-permissions "/home/$NB_USER"
# Install conda as $NB_USER and check the sha256 sum provided on the download site
WORKDIR /tmp
# Prerequisites installation: conda, pip, tini
RUN wget --quiet "https://github.com/conda-forge/miniforge/releases/download/${miniforge_version}/${miniforge_installer}" && \
echo "${miniforge_checksum} *${miniforge_installer}" | sha256sum --check && \
/bin/bash "${miniforge_installer}" -f -b -p $CONDA_DIR && \
rm "${miniforge_installer}" && \
# Conda configuration see https://conda.io/projects/conda/en/latest/configuration.html
echo "conda ${CONDA_VERSION}" >> $CONDA_DIR/conda-meta/pinned && \
conda config --system --set auto_update_conda false && \
conda config --system --set show_channel_urls true && \
if [ ! $PYTHON_VERSION = 'default' ]; then conda install --yes python=$PYTHON_VERSION; fi && \
conda list python | grep '^python ' | tr -s ' ' | cut -d '.' -f 1,2 | sed 's/$/.*/' >> $CONDA_DIR/conda-meta/pinned && \
conda install --quiet --yes \
"conda=${CONDA_VERSION}" \
'pip' \
'tini=0.18.0' && \
conda update --all --quiet --yes && \
conda list tini | grep tini | tr -s ' ' | cut -d ' ' -f 1,2 >> $CONDA_DIR/conda-meta/pinned && \
conda clean --all -f -y && \
rm -rf /home/$NB_USER/.cache/yarn && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
# Install Jupyter Notebook, Lab, and Hub
# Generate a notebook server config
# Cleanup temporary files
# Correct permissions
# Do all this in a single RUN command to avoid duplicating all of the
# files across image layers when the permissions change
RUN conda install --quiet --yes \
'notebook=6.2.0' \
'jupyterhub=1.3.0' \
'jupyterlab=3.0.5' && \
conda clean --all -f -y && \
npm cache clean --force && \
jupyter notebook --generate-config && \
rm -rf $CONDA_DIR/share/jupyter/lab/staging && \
rm -rf /home/$NB_USER/.cache/yarn && \
fix-permissions $CONDA_DIR && \
fix-permissions /home/$NB_USER
# Copy local files as late as possible to avoid cache busting
COPY jupyter/start.sh jupyter/start-notebook.sh jupyter/start-singleuser.sh /usr/local/bin/
# Currently need to have both jupyter_notebook_config and jupyter_server_config to support classic and lab
COPY jupyter/jupyter_notebook_config.py /etc/jupyter/
# Fix permissions on /etc/jupyter as root
USER root
# Prepare upgrade to JupyterLab V3.0 #1205
RUN sed -re "s/c.NotebookApp/c.ServerApp/g" \
/etc/jupyter/jupyter_notebook_config.py > /etc/jupyter/jupyter_server_config.py
RUN chmod 0777 /usr/local/bin/start-notebook.sh
RUN mkdir -p /var/log/supervisord
RUN mkdir -p /home/project/notebooks
################################################################# NBVIEWER
ENV LANG=C.UTF-8
RUN DEBIAN_FRONTEND=noninteractive apt-get update \
&& apt-get install -yq --no-install-recommends \
ca-certificates \
libcurl4 \
git \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
COPY --from=builder /wheels /wheels
RUN python3 -mpip install --no-cache /wheels/*
# To change the number of threads use env var NBVIEWER_THREADS
# docker run -d -e NBVIEWER_THREADS=4 -p 80:8080 nbviewer
# Default: 2 threads (Dockerfile ENV lines do not support trailing comments)
ENV NBVIEWER_THREADS=2
RUN mkdir -p /home/nbviewer
################################################################# PACKAGES & MODULES
#RUN apt-get install -y default-libmysqlclient-dev build-essential
COPY requirements-base-data.txt /home/installed-python-packages/requirements-base-data.txt
RUN pip install -r /home/installed-python-packages/requirements-base-data.txt
################## USER
RUN chown -R vadym /var/log/supervisord
RUN chown -R vadym /home/project/notebooks
RUN chown -R vadym /home/vadym
RUN chown -R vadym /opt/conda /etc/jupyter
RUN chown -R vadym ${LUIGI_CONFIG_DIR} /etc/service/luigid/
RUN chown -R vadym /home/installed-python-packages
RUN chown -R vadym /home/examples/luigi
RUN chown -R vadym /home/nbviewer
COPY data-workstation.conf /etc/supervisord/data-workstation.conf
USER vadym
@ -0,0 +1,80 @@
# Data Workstation
```sh
docker build -t data-workstation-base:3.8 --build-arg docker_registry=rg.fr-par.scw.cloud/dgym .
docker run -p 3000:3000 -p 8001:8000 -p 3012:3012 -p 8092:8092 -p 8448:8448 -p 9992:9992 -p 8085:8085 -p 8086:8086 -p 8082:8082 -p 8084:8084 data-workstation-base:3.8
docker run -p 3000:3000 -p 8001:8000 -p 3012:3012 -p 8092:8092 -p 8448:8448 -p 9992:9992 -p 8085:8085 -p 8086:8086 -p 8082:8082 -p 8084:8084 rg.fr-par.scw.cloud/dgym/python-workstation:3.8
```
## Luigi
Useful links:
- [Luigi Github Repo](https://github.com/spotify/luigi)
- [A Tutorial on Luigi, Spotify's Pipeline](https://towardsdatascience.com/a-tutorial-on-luigi-spotifys-pipeline-5c694fb4113e)
- [Create your first ETL in Luigi](http://blog.adnansiddiqi.me/create-your-first-etl-in-luigi/)
- [Luigi on PyPi](https://pypi.org/project/luigi/)
## DBT
Useful links:
- [DBT main page](https://docs.getdbt.com/)
- [dbt(Data Build Tool) Tutorial](https://www.startdataengineering.com/post/dbt-data-build-tool-tutorial/)
- [DBT on PyPi](https://pypi.org/project/dbt/)
- [Analytics Engineering with dbt and PostgreSQL](https://dsotm-rsa.space/post/2019/09/01/analytics-engineering-with-dbt-data-build-tool-and-postgres-11/)
```sh
dbt init simple_dbt_project --adapter postgres
```
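`dbt init` scaffolds the project; connection settings live in `~/.dbt/profiles.yml`. A minimal sketch for the postgres adapter (profile name, host and credentials below are all placeholders):

```yaml
simple_dbt_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: localhost
      user: dbt_user
      password: dbt_password
      port: 5432
      dbname: analytics
      schema: public
      threads: 4
```

The top-level key must match the `profile:` entry in the generated `dbt_project.yml`.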
## Great expectations
Useful links:
- [Great Expectations main page](https://greatexpectations.io/)
- [Great Expectations documentation](https://docs.greatexpectations.io/en/latest/)
- [Great Expectations on PyPi](https://pypi.org/project/great-expectations/)
- [Understanding Great Expectations and How to Use It](https://medium.com/hashmapinc/understanding-great-expectations-and-how-to-use-it-7754c78962f4)
- [Know Your Data Pipelines with Great Expectations](https://medium.com/hashmapinc/know-your-data-pipelines-with-great-expectations-tool-b6d38a2e6f06)
- https://www.startdataengineering.com/post/ensuring-data-quality-with-great-expectations/
- https://docs.greatexpectations.io/en/stable/guides/tutorials/how_to_create_expectations.html
## Papermill
Useful links:
- [Papermill Report GitHub](https://github.com/ariadnext/papermill_report)
- [Automated Report Generation with Papermill: Part 1](https://pbpython.com/papermil-rclone-report-1.html)
- [Automated Report Generation with Papermill: Part 2](https://pbpython.com/papermil-rclone-report-2.html)
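Papermill's core trick is parameter injection: it finds the notebook cell tagged `parameters` and inserts a new cell after it containing your overrides, then executes the result. The injection step alone can be sketched with the standard library (the notebook dict below is a minimal hand-built example, not a full `.ipynb` document):

```python
import json

def inject_parameters(nb, params):
    """Insert a code cell with `params` after the cell tagged 'parameters'."""
    new_cell = {
        "cell_type": "code",
        "metadata": {"tags": ["injected-parameters"]},
        "source": [f"{k} = {json.dumps(v)}\n" for k, v in params.items()],
        "outputs": [],
        "execution_count": None,
    }
    for i, cell in enumerate(nb["cells"]):
        if "parameters" in cell.get("metadata", {}).get("tags", []):
            nb["cells"].insert(i + 1, new_cell)
            break
    return nb

nb = {"cells": [{"cell_type": "code",
                 "metadata": {"tags": ["parameters"]},
                 "source": ["alpha = 0.1\n"]}]}
nb = inject_parameters(nb, {"alpha": 0.6})
print(nb["cells"][1]["source"])  # ['alpha = 0.6\n']
```

Because the injected cell comes after the defaults, the overrides win when the notebook is executed top to bottom.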
## Prefect
Useful links:
- [Prefect installation guide](https://docs.prefect.io/core/getting_started/installation.html)
## ADVANCED DATA
Useful links:
- [25 Hot New Data Tools and What They Don't Do](https://www.datacouncil.ai/blog/25-hot-new-data-tools-and-what-they-dont-do)
## PREFECT
Install a pinned Prefect release in the image with `RUN pip install prefect==0.14.20`.
```
[program:prefect]
directory=/home/
command=/bin/sh -c " prefect backend server; prefect server start --ui-port 8095; prefect agent local start "
stderr_logfile = /var/log/prefect-stderr.log
stdout_logfile = /var/log/prefect-stdout.log
logfile_maxbytes = 1024
```
Publish the Prefect UI with `-p 8095:8095` when running the container.


@ -0,0 +1,33 @@
[program:luigi]
directory=/opt/luigi/
command=/bin/sh -c " sh luigid.sh "
stderr_logfile = /var/log/luigi-stderr.log
stdout_logfile = /var/log/luigi-stdout.log
logfile_maxbytes = 1024
[program:jupyter]
directory=/usr/local/bin/
command=jupyter notebook --allow-root --ip='*' --NotebookApp.token='' --NotebookApp.password='' --notebook-dir=/home/project/notebooks --no-browser --port=8085
stderr_logfile = /var/log/jupyter-stderr.log
stdout_logfile = /var/log/jupyter-stdout.log
logfile_maxbytes = 1024
[program:jupyterlab]
directory=/usr/local/bin/
command=jupyter lab --allow-root --ip='*' --NotebookApp.token='' --NotebookApp.password='' --notebook-dir=/home/project/notebooks --no-browser --port=8086
stderr_logfile = /var/log/jupyterlab-stderr.log
stdout_logfile = /var/log/jupyterlab-stdout.log
logfile_maxbytes = 1024
[program:nbviewer]
directory=/usr/local/bin/
command=python -m nbviewer --port=8084 --localfiles=/home/nbviewer
stderr_logfile = /var/log/nbviewer-stderr.log
stdout_logfile = /var/log/nbviewer-stdout.log
logfile_maxbytes = 1024


@ -0,0 +1,35 @@
#!/bin/bash
# set permissions on a directory
# after any installation, if a directory needs to be (human) user-writable,
# run this script on it.
# It will make everything in the directory owned by the group $NB_GID
# and writable by that group.
# Deployments that want to set a specific user id can preserve permissions
# by adding the `--group-add users` line to `docker run`.
# uses find to avoid touching files that already have the right permissions,
# which would cause massive image explosion
# right permissions are:
# group=$NB_GID
# AND permissions include group rwX (directory-execute)
# AND directories have setuid,setgid bits set
set -e
for d in "$@"; do
find "$d" \
! \( \
-group $NB_GID \
-a -perm -g+rwX \
\) \
-exec chgrp $NB_GID {} \; \
-exec chmod g+rwX {} \;
# setuid, setgid *on directories only*
find "$d" \
\( \
-type d \
-a ! -perm -6000 \
\) \
-exec chmod +6000 {} \;
done


@ -0,0 +1,55 @@
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
from jupyter_core.paths import jupyter_data_dir
import subprocess
import os
import errno
import stat
c = get_config() # noqa: F821
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.port = 8888
c.NotebookApp.open_browser = False
# https://github.com/jupyter/notebook/issues/3130
c.FileContentsManager.delete_to_trash = False
# Generate a self-signed certificate
if 'GEN_CERT' in os.environ:
dir_name = jupyter_data_dir()
pem_file = os.path.join(dir_name, 'notebook.pem')
try:
os.makedirs(dir_name)
except OSError as exc: # Python >2.5
if exc.errno == errno.EEXIST and os.path.isdir(dir_name):
pass
else:
raise
# Generate an openssl.cnf file to set the distinguished name
cnf_file = os.path.join(os.getenv('CONDA_DIR', '/usr/lib'), 'ssl', 'openssl.cnf')
if not os.path.isfile(cnf_file):
with open(cnf_file, 'w') as fh:
fh.write('''\
[req]
distinguished_name = req_distinguished_name
[req_distinguished_name]
''')
# Generate a certificate if one doesn't exist on disk
subprocess.check_call(['openssl', 'req', '-new',
'-newkey', 'rsa:2048',
'-days', '365',
'-nodes', '-x509',
'-subj', '/C=XX/ST=XX/L=XX/O=generated/CN=generated',
'-keyout', pem_file,
'-out', pem_file])
# Restrict access to the file
os.chmod(pem_file, stat.S_IRUSR | stat.S_IWUSR)
c.NotebookApp.certfile = pem_file
# Change default umask for all subprocesses of the notebook server if set in
# the environment
if 'NB_UMASK' in os.environ:
os.umask(int(os.environ['NB_UMASK'], 8))


@ -0,0 +1,20 @@
#!/bin/bash
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
set -e
wrapper=""
if [[ "${RESTARTABLE}" == "yes" ]]; then
wrapper="run-one-constantly"
fi
if [[ ! -z "${JUPYTERHUB_API_TOKEN}" ]]; then
# launched by JupyterHub, use single-user entrypoint
exec /usr/local/bin/start-singleuser.sh "$@"
elif [[ ! -z "${JUPYTER_ENABLE_LAB}" ]]; then
. /usr/local/bin/start.sh $wrapper jupyter lab "$@"
else
echo "WARN: Jupyter Notebook deprecation notice https://github.com/jupyter/docker-stacks#jupyter-notebook-deprecation-notice."
. /usr/local/bin/start.sh $wrapper jupyter notebook "$@"
fi


@ -0,0 +1,39 @@
#!/bin/bash
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
set -e
# set default ip to 0.0.0.0
if [[ "$NOTEBOOK_ARGS $@" != *"--ip="* ]]; then
NOTEBOOK_ARGS="--ip=0.0.0.0 $NOTEBOOK_ARGS"
fi
# handle some deprecated environment variables
# from DockerSpawner < 0.8.
# These won't be passed from DockerSpawner 0.9,
# so avoid specifying --arg=empty-string
if [ ! -z "$NOTEBOOK_DIR" ]; then
NOTEBOOK_ARGS="--notebook-dir='$NOTEBOOK_DIR' $NOTEBOOK_ARGS"
fi
if [ ! -z "$JPY_PORT" ]; then
NOTEBOOK_ARGS="--port=$JPY_PORT $NOTEBOOK_ARGS"
fi
if [ ! -z "$JPY_USER" ]; then
NOTEBOOK_ARGS="--user=$JPY_USER $NOTEBOOK_ARGS"
fi
if [ ! -z "$JPY_COOKIE_NAME" ]; then
NOTEBOOK_ARGS="--cookie-name=$JPY_COOKIE_NAME $NOTEBOOK_ARGS"
fi
if [ ! -z "$JPY_BASE_URL" ]; then
NOTEBOOK_ARGS="--base-url=$JPY_BASE_URL $NOTEBOOK_ARGS"
fi
if [ ! -z "$JPY_HUB_PREFIX" ]; then
NOTEBOOK_ARGS="--hub-prefix=$JPY_HUB_PREFIX $NOTEBOOK_ARGS"
fi
if [ ! -z "$JPY_HUB_API_URL" ]; then
NOTEBOOK_ARGS="--hub-api-url=$JPY_HUB_API_URL $NOTEBOOK_ARGS"
fi
NOTEBOOK_BIN="jupyterhub-singleuser"
. /usr/local/bin/start.sh $NOTEBOOK_BIN $NOTEBOOK_ARGS "$@"


@ -0,0 +1,147 @@
#!/bin/bash
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
set -e
# Exec the specified command or fall back on bash
if [ $# -eq 0 ]; then
cmd=( "bash" )
else
cmd=( "$@" )
fi
run-hooks () {
# Source scripts or run executable files in a directory
if [[ ! -d "$1" ]] ; then
return
fi
echo "$0: running hooks in $1"
for f in "$1/"*; do
case "$f" in
*.sh)
echo "$0: running $f"
source "$f"
;;
*)
if [[ -x "$f" ]] ; then
echo "$0: running $f"
"$f"
else
echo "$0: ignoring $f"
fi
;;
esac
done
echo "$0: done running hooks in $1"
}
run-hooks /usr/local/bin/start-notebook.d
# Handle special flags if we're root
if [ $(id -u) == 0 ] ; then
# Only attempt to change the jovyan username if it exists
if id jovyan &> /dev/null ; then
echo "Set username to: $NB_USER"
usermod -d /home/$NB_USER -l $NB_USER jovyan
fi
# handle home and working directory if the username changed
if [[ "$NB_USER" != "jovyan" ]]; then
# changing username, make sure homedir exists
# (it could be mounted, and we shouldn't create it if it already exists)
if [[ ! -e "/home/$NB_USER" ]]; then
echo "Relocating home dir to /home/$NB_USER"
mv /home/jovyan "/home/$NB_USER" || ln -s /home/jovyan "/home/$NB_USER"
fi
# if workdir is in /home/jovyan, cd to /home/$NB_USER
if [[ "$PWD/" == "/home/jovyan/"* ]]; then
newcwd="/home/$NB_USER/${PWD:13}"
echo "Setting CWD to $newcwd"
cd "$newcwd"
fi
fi
# Handle case where provisioned storage does not have the correct permissions by default
# Ex: default NFS/EFS (no auto-uid/gid)
if [[ "$CHOWN_HOME" == "1" || "$CHOWN_HOME" == 'yes' ]]; then
echo "Changing ownership of /home/$NB_USER to $NB_UID:$NB_GID with options '${CHOWN_HOME_OPTS}'"
chown $CHOWN_HOME_OPTS $NB_UID:$NB_GID /home/$NB_USER
fi
if [ ! -z "$CHOWN_EXTRA" ]; then
for extra_dir in $(echo $CHOWN_EXTRA | tr ',' ' '); do
echo "Changing ownership of ${extra_dir} to $NB_UID:$NB_GID with options '${CHOWN_EXTRA_OPTS}'"
chown $CHOWN_EXTRA_OPTS $NB_UID:$NB_GID $extra_dir
done
fi
# Change UID:GID of NB_USER to NB_UID:NB_GID if it does not match
if [ "$NB_UID" != $(id -u $NB_USER) ] || [ "$NB_GID" != $(id -g $NB_USER) ]; then
echo "Set user $NB_USER UID:GID to: $NB_UID:$NB_GID"
if [ "$NB_GID" != $(id -g $NB_USER) ]; then
groupadd -f -g $NB_GID -o ${NB_GROUP:-${NB_USER}}
fi
userdel $NB_USER
useradd --home /home/$NB_USER -u $NB_UID -g $NB_GID -G 100 -l $NB_USER
fi
# Enable sudo if requested
if [[ "$GRANT_SUDO" == "1" || "$GRANT_SUDO" == 'yes' ]]; then
echo "Granting $NB_USER sudo access and appending $CONDA_DIR/bin to sudo PATH"
echo "$NB_USER ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/notebook
fi
# Add $CONDA_DIR/bin to sudo secure_path
sed -r "s#Defaults\s+secure_path\s*=\s*\"?([^\"]+)\"?#Defaults secure_path=\"\1:$CONDA_DIR/bin\"#" /etc/sudoers | grep secure_path > /etc/sudoers.d/path
# Exec the command as NB_USER with the PATH and the rest of
# the environment preserved
run-hooks /usr/local/bin/before-notebook.d
echo "Executing the command: ${cmd[@]}"
exec sudo -E -H -u $NB_USER PATH=$PATH XDG_CACHE_HOME=/home/$NB_USER/.cache PYTHONPATH=${PYTHONPATH:-} "${cmd[@]}"
else
if [[ "$NB_UID" == "$(id -u jovyan 2>/dev/null)" && "$NB_GID" == "$(id -g jovyan 2>/dev/null)" ]]; then
# User is not attempting to override user/group via environment
# variables, but they could still have overridden the uid/gid that
# container runs as. Check that the user has an entry in the passwd
# file and if not add an entry.
STATUS=0 && whoami &> /dev/null || STATUS=$? && true
if [[ "$STATUS" != "0" ]]; then
if [[ -w /etc/passwd ]]; then
echo "Adding passwd file entry for $(id -u)"
cat /etc/passwd | sed -e "s/^jovyan:/nayvoj:/" > /tmp/passwd
echo "jovyan:x:$(id -u):$(id -g):,,,:/home/jovyan:/bin/bash" >> /tmp/passwd
cat /tmp/passwd > /etc/passwd
rm /tmp/passwd
else
echo 'Container must be run with group "root" to update passwd file'
fi
fi
# Warn if the user isn't going to be able to write files to $HOME.
if [[ ! -w /home/jovyan ]]; then
echo 'Container must be run with group "users" to update files'
fi
else
# Warn if looks like user want to override uid/gid but hasn't
# run the container as root.
if [[ ! -z "$NB_UID" && "$NB_UID" != "$(id -u)" ]]; then
echo 'Container must be run as root to set $NB_UID'
fi
if [[ ! -z "$NB_GID" && "$NB_GID" != "$(id -g)" ]]; then
echo 'Container must be run as root to set $NB_GID'
fi
fi
# Warn if looks like user want to run in sudo mode but hasn't run
# the container as root.
if [[ "$GRANT_SUDO" == "1" || "$GRANT_SUDO" == 'yes' ]]; then
echo 'Container must be run as root to grant sudo permissions'
fi
# Execute the command
run-hooks /usr/local/bin/before-notebook.d
echo "Executing the command: ${cmd[@]}"
exec "${cmd[@]}"
fi


@ -0,0 +1,16 @@
__pycache__/
.DS_Store
.eggs
.github
.gitignore
.ipynb_checkpoints/
.travis.yml
.vscode
*.pyc
build/
dist/
nbviewer.egg-info/
nbviewer/static/build/
nbviewer/static/components/
node_modules/
notebook-*/


@ -0,0 +1,14 @@
[flake8]
# Ignore style and complexity
# E: style errors
# W: style warnings
# F401: module imported but unused
# F811: redefinition of unused `name` from line `N`
# F841: local variable assigned but never used
ignore = E, C, W, F401, F403, F811, F841, E402, I100, I101, D400
exclude =
helm-chart,
hooks,
setup.py,
statuspage,
versioneer.py


@ -0,0 +1 @@
nbviewer/_version.py export-subst


@ -0,0 +1,23 @@
__pycache__
.debug
.DS_Store
.pandoc
*.egg-info
*.pyc
*.swp
*.un~
*/static/components
\.ipynb_checkpoints
bin
build
dist
node_modules
screenshots
nbviewer/git_info.json
# ignore downloaded notebook sources
notebook-*
.eggs/
MANIFEST
package-lock.json
.vscode/
*.tgz


@ -0,0 +1,21 @@
repos:
- repo: https://github.com/asottile/reorder_python_imports
rev: v1.3.5
hooks:
- id: reorder-python-imports
language_version: python3
- repo: https://github.com/ambv/black
rev: 18.9b0
hooks:
- id: black
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.1.0
hooks:
- id: end-of-file-fixer
- id: check-json
- id: check-yaml
exclude: ^helm-chart/nbviewer/templates/
- id: check-case-conflict
- id: check-executables-have-shebangs
- id: requirements-txt-fixer
- id: flake8


@ -0,0 +1,50 @@
language: python
node_js:
- 6
python:
- 3.5
- 3.6
before_install:
- sudo apt-get update
- sudo apt-get install -qq libzmq3-dev pandoc libcurl4-gnutls-dev libmemcached-dev libgnutls28-dev
- pip install --upgrade setuptools pip
- pip install -r requirements-dev.txt
install:
- pip install --upgrade setuptools pip
- pip install -r requirements.txt
- pip install -e .
# run tests
script:
- invoke test
# list the jobs
jobs:
include:
- name: autoformatting check
python: 3.6
# NOTE: It does not suffice to override to: null, [], or [""]. Travis will
# fall back to the default if we do.
before_install: echo "Do nothing before install."
install: pip install pre-commit
script:
- pre-commit run --all-files
after_success: echo "Do nothing after success."
after_failure:
- |
echo "You can install pre-commit hooks to automatically run formatting"
echo "on each commit with:"
echo " pre-commit install"
echo "or you can run by hand on staged files with"
echo " pre-commit run"
echo "or after-the-fact on already committed files with"
echo " pre-commit run --all-files"
env:
global:
- secure: Sv53YMdsVTin1hUPRqIuvdAOJ0UwklEowW49qpxY9wSgiAM79D+e1b5Yxrn+RTtS3WGlvK1aKHICc+2ajccEJkKFL8WDy2SnTnoWPadrEy4NAGLkNMGK+bAYMnLNoNRbSGVz5JpvNJ7JkeaEplhJ572OJOxa1X7ZF9165ZbOWng=
- secure: ajFM7ch1/xYyEjusyTzd963GOOLg5/H0lxvQ7L6r+LBDDro79FxNPMcAkZxF7n24rkPO8I+AP3FfUwbQf4ShmGkAdsxSFMc2d7GDUowxiicPr5bMitygxlzl2ox2lWdpt4QldmEywbrCKKwt/cZkKxE8er9xBcwe7xw/2xUYOLk=


@ -0,0 +1,58 @@
# Contributing to NBViewer
Welcome! As a [Jupyter](https://jupyter.org) project,
you can follow the [Jupyter contributor guide](https://jupyter.readthedocs.io/en/latest/contributor/content-contributor.html).
Make sure to also follow [Project Jupyter's Code of Conduct](https://github.com/jupyter/governance/blob/master/conduct/code_of_conduct.md)
for a friendly and welcoming collaborative environment.
## Setting up a development environment
See the instructions for local development or local installation first.
NBViewer has adopted automatic code formatting so you shouldn't
need to worry too much about your code style.
As long as your code is valid,
the pre-commit hook should take care of how it should look. Here is how to set up pre-commit hooks for automatic code formatting, etc.
```bash
pre-commit install
```
You can also invoke the pre-commit hook manually at any time with
```bash
pre-commit run
```
which should run any autoformatting on your code
and tell you about any errors it couldn't fix automatically.
You may also install [black integration](https://github.com/ambv/black#editor-integration)
into your text editor to format code automatically.
If you have already committed files before setting up the pre-commit
hook with `pre-commit install`, you can fix everything up using
`pre-commit run --all-files`. You need to make the fixing commit
yourself after that.
## Running the Tests
It's a good idea to write tests that exercise any new features,
or that trigger any bugs you have fixed, to catch regressions. `nose` is used to run the test suite. The tests currently make calls to
external APIs such as GitHub, so it is best to set your GitHub API token when
running them:
```shell
$ cd <path to repo>
$ pip install -r requirements-dev.txt
$ GITHUB_API_TOKEN=<your token> python setup.py test
```
You can run the tests with:
```bash
nosetests -v
```
in the repo directory.


@ -0,0 +1,58 @@
# Define a builder image
FROM python:3.7-buster as builder
ENV DEBIAN_FRONTEND=noninteractive
ENV LANG=C.UTF-8
RUN apt-get update \
&& apt-get install -yq --no-install-recommends \
ca-certificates \
libcurl4-gnutls-dev \
git \
nodejs \
npm
# Python requirements
COPY ./requirements-dev.txt /srv/nbviewer/
COPY ./requirements.txt /srv/nbviewer/
RUN python3 -mpip install -r /srv/nbviewer/requirements-dev.txt -r /srv/nbviewer/requirements.txt
WORKDIR /srv/nbviewer
# Copy source tree in
COPY . /srv/nbviewer
RUN python3 setup.py build && \
python3 -mpip wheel -vv . -w /wheels
# Now define the runtime image
FROM python:3.7-slim-buster
LABEL maintainer="Jupyter Project <jupyter@googlegroups.com>"
ENV DEBIAN_FRONTEND=noninteractive
ENV LANG=C.UTF-8
RUN apt-get update \
&& apt-get install -yq --no-install-recommends \
ca-certificates \
libcurl4 \
git \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
COPY --from=builder /wheels /wheels
RUN python3 -mpip install --no-cache /wheels/*
# To change the number of threads use
# docker run -d -e NBVIEWER_THREADS=4 -p 80:8080 nbviewer
ENV NBVIEWER_THREADS 2
WORKDIR /srv/nbviewer
RUN mkdir -p /home/nobody/notes
EXPOSE 8080
USER nobody
EXPOSE 9000
ENTRYPOINT python -m nbviewer --port=8080 --localfiles=/home/nobody/notes
#CMD ["python", "-m", "nbviewer", "--port=8080", "--localfiles=/home/nobody/notes"]


@ -0,0 +1,35 @@
=============================
The NbViewer licensing terms
=============================
NbViewer is licensed under the terms of the Modified BSD License (also known as
New or Revised BSD), as follows:
Copyright (c) 2012-2013, IPython Development Team
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
Neither the name of the IPython Development Team nor the names of its
contributors may be used to endorse or promote products derived from this
software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@ -0,0 +1,4 @@
include versioneer.py
include nbviewer/_version.py
include requirements.txt
include nbviewer/git_info.json


@ -0,0 +1,237 @@
**[Quick Run](#quick-run)** |
**[GitHub Enterprise](#github-enterprise)** |
**[Base URL](#base-url)** |
**[Local Development](#local-development)** |
**[Contributing](#contributing)** |
**[Extensions](#extending-the-notebook-viewer)** |
**[Configuration](#config-file-and-command-line-configuration)** |
**[Security](#securing-the-notebook-viewer)**
# Jupyter Notebook Viewer
[![Latest PyPI version](https://img.shields.io/pypi/v/nbviewer?logo=pypi)](https://pypi.python.org/pypi/nbviewer)
[![TravisCI build status](https://img.shields.io/travis/jupyter/nbviewer/master?logo=travis)](https://travis-ci.org/jupyter/nbviewer)
[![GitHub](https://img.shields.io/badge/issue_tracking-github-blue?logo=github)](https://github.com/jupyter/nbviewer/issues)
[![Gitter](https://img.shields.io/badge/social_chat-gitter-blue?logo=gitter)](https://gitter.im/jupyter/nbviewer)
Jupyter NBViewer is the web application behind
[The Jupyter Notebook Viewer](http://nbviewer.jupyter.org),
which is graciously hosted by [OVHcloud](https://ovhcloud.com).
Run this locally to get most of the features of nbviewer on your own network.
If you need help using or installing Jupyter Notebook Viewer, please use the [jupyter/help](https://github.com/jupyter/help) issue tracker. If you would like to propose an enhancement to nbviewer or file a bug report, please [open an issue here, in the jupyter/nbviewer project](https://github.com/jupyter/nbviewer).
## Quick Run
If you have `docker` installed, you can pull and run the currently built version of the Docker container by
```shell
$ docker pull jupyter/nbviewer
$ docker run -p 8080:8080 jupyter/nbviewer
```
It automatically gets built with each push to `master`, so you'll always be able to get the freshest copy.
For speed and friendliness to GitHub, be sure to set `GITHUB_OAUTH_KEY` and `GITHUB_OAUTH_SECRET`:
```shell
$ docker run -p 8080:8080 -e 'GITHUB_OAUTH_KEY=YOURKEY' \
-e 'GITHUB_OAUTH_SECRET=YOURSECRET' \
jupyter/nbviewer
```
Or to use your GitHub personal access token, you can just set `GITHUB_API_TOKEN`.
## GitHub Enterprise
To use nbviewer on your own GitHub Enterprise instance you need to set `GITHUB_API_URL`.
The relevant [API endpoints for GitHub Enterprise](https://developer.github.com/v3/enterprise/) are prefixed with `http://hostname/api/v3`.
You must also specify your `OAUTH` or `API_TOKEN` as explained above. For example:
```shell
$ docker run -p 8080:8080 -e 'GITHUB_OAUTH_KEY=YOURKEY' \
-e 'GITHUB_OAUTH_SECRET=YOURSECRET' \
-e 'GITHUB_API_URL=https://ghe.example.com/api/v3/' \
jupyter/nbviewer
```
With this configured all GitHub API requests will go to your Enterprise instance so you can view all of your internal notebooks.
## Base URL
If the environment variable `JUPYTERHUB_SERVICE_PREFIX` is specified, then NBViewer _always_ uses the value of this environment variable as the base URL.
In the case that there is no value for `JUPYTERHUB_SERVICE_PREFIX`, then as a backup the value of the `--base-url` flag passed to the `python -m nbviewer` command on the command line will be used as the base URL.
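That precedence (environment variable first, `--base-url` flag as fallback) amounts to a one-line lookup; a sketch, with the flag value passed in as a plain argument:

```python
import os

def resolve_base_url(flag_value, environ=os.environ):
    """JUPYTERHUB_SERVICE_PREFIX always wins; the --base-url flag is the fallback."""
    return environ.get("JUPYTERHUB_SERVICE_PREFIX", flag_value)

# No JupyterHub in play: the flag value is used.
print(resolve_base_url("/nb/", environ={}))  # /nb/

# Running as a JupyterHub service: the env var takes precedence.
hub_env = {"JUPYTERHUB_SERVICE_PREFIX": "/services/nbviewer/"}
print(resolve_base_url("/nb/", environ=hub_env))  # /services/nbviewer/
```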
## Local Development
### With Docker
You can build a docker image that uses your local branch.
#### Build
```shell
$ cd <path to repo>
$ docker build -t nbviewer .
```
#### Run
```shell
$ cd <path to repo>
$ docker run -p 8080:8080 nbviewer
```
### With Docker Compose
The Notebook Viewer uses `memcached` in production. To locally try out this
setup, a [docker-compose](https://docs.docker.com/compose/) configuration is
provided to easily start/stop the `nbviewer` and `memcached` containers
together from your current branch. You will need to install `docker` prior
to this.
#### Run
```shell
$ cd <path to repo>
$ pip install docker-compose
$ docker-compose up
```
### Local Installation
The Notebook Viewer requires several binary packages to be installed on your system. The primary ones are `libmemcached-dev libcurl4-openssl-dev pandoc libevent-dev libgnutls28-dev`. Package names may differ on your system, see [salt-states](https://github.com/rgbkrk/salt-states-nbviewer/blob/master/nbviewer/init.sls) for more details.
If they are installed, you can install the required Python packages via pip.
```shell
$ cd <path to repo>
$ pip install -r requirements.txt
```
#### Static Assets
Static assets are maintained with `bower` and `less` (which require having
`npm` installed), and the `invoke` python module.
```shell
$ cd <path to repo>
$ pip install -r requirements-dev.txt
$ npm install
$ invoke bower
$ invoke less [-d]
```
This will download the relevant assets into `nbviewer/static/components` and create the built assets in `nbviewer/static/build`.
Pass `-d` or `--debug` to `invoke less` to create a CSS sourcemap, useful for debugging.
#### Running Locally
```shell
$ cd <path to repo>
$ python -m nbviewer --debug --no-cache
```
This will automatically relaunch the server if a change is detected on a python file, and not cache any results. You can then just do the modifications you like to the source code and/or the templates then refresh the pages.
## Contributing
If you would like to contribute to the project, please read the [`CONTRIBUTING.md`](CONTRIBUTING.md). The `CONTRIBUTING.md` file
explains how to set up a development installation and how to run the test suite.
## Extending the Notebook Viewer
### Providers
Providers are sources of notebooks and of directories of notebooks.
`nbviewer` ships with several providers:
- `url`
- `gist`
- `github`
- `local`
#### Writing a new Provider
There are already several providers
[proposed/requested](https://github.com/jupyter/nbviewer/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+label%3Atag%3AProvider). Some providers are more involved than others, and some,
such as those which would require user authentication, will take some work to
support properly.
A provider is implemented as a python module, which can expose a few functions:
##### `uri_rewrites`
If you just need to rewrite URLs (or URIs) of another site/namespace, implement
`uri_rewrites`, which will allow the front page to transform an arbitrary string
(usually a URI fragment), escape it correctly, and turn it into a "canonical"
nbviewer URL. See the [dropbox provider](./nbviewer/providers/dropbox/handlers.py)
for a simple example of rewriting URLs without using a custom API client.
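Conceptually, a rewrite table is a list of (regex, URL template) pairs tried in order; the sketch below shows that shape (the `example.org` pattern and `/url/...` template are made up for illustration; see the dropbox provider for the real signature and rewrites):

```python
import re

# Hypothetical rewrite table: regex -> nbviewer URL template ({0} = captured path).
def uri_rewrites(rewrites=[]):
    return rewrites + [
        (r"^https?://example\.org/(.*)$", "/url/example.org/{0}"),
    ]

def rewrite(uri):
    """Return the first matching rewrite, or the input unchanged."""
    for pattern, template in uri_rewrites():
        m = re.match(pattern, uri)
        if m:
            return template.format(m.group(1))
    return uri

print(rewrite("https://example.org/nb/demo.ipynb"))  # /url/example.org/nb/demo.ipynb
```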
##### `default_handlers`
If you need custom logic, such as connecting to an API, implement
`default_handlers`. See the [github provider](./nbviewer/providers/github/handlers.py)
for a complex example of providing multiple handlers.
##### Error Handling
While you _could_ re-implement upstream HTTP error handling, a small
convenience method is provided for intercepting HTTP errors.
On a given URL handler that inherits from `BaseHandler`, overload the
`client_error_message` and re-call it with your message (or `None`). See the
[gist provider](./nbviewer/providers/gist/handlers.py) for an example of customizing the
error message.
### Formats
Formats are ways to present notebooks to the user.
`nbviewer` ships with three formats:
- `html`
- `slides`
- `script`
#### Writing a new Format
If you'd like to write a new format, open a ticket, or speak up on [gitter](https://gitter.im/jupyter/nbviewer)!
We have some work yet to do to support your next big thing in notebook
publishing, and we'd love to hear from you.
## Config File and Command Line Configuration
NBViewer is configurable using a config file, by default called `nbviewer_config.py`. You can modify the name and location of the config file that NBViewer looks for using the `--config-file` command line flag. (The location is always a relative path, i.e. relative to where the command `python -m nbviewer` is run, and never an absolute path.)
If you don't know which attributes of NBViewer you can configure using the config file, run `python -m nbviewer --generate-config` (or `python -m nbviewer --generate-config --config-file="my_custom_name.py"`) to write a default config file which has all of the configurable options commented out and set to their default values. To change a configurable option to a new value, uncomment the corresponding line and change the default value to the new value.
You can also run `python -m nbviewer --help-all` to see all of the configurable options. This is a more comprehensive version of `python -m nbviewer --help`, which gives a list of the most common ones along with flags and aliases you can use to set their values temporarily via the command line.
The config file uses [the standard configuration syntax for Jupyter projects](https://traitlets.readthedocs.io/en/stable/config.html). For example, to configure the default port used to be 9000, add the line `c.NBViewer.port = 9000` to the config file. If you want to do this just once, you can also run `python -m nbviewer --NBViewer.port=9000` at the command line. (`NBViewer.port` also has the alias `port`, making it also possible to do, in this specific case, `python -m nbviewer --port=9000`. However not all configurable options have shorthand aliases like this; you can check using the outputs of `python -m nbviewer --help` and `python -m nbviewer --help-all` to see which ones do and which ones don't.)
One thing this allows you to do, for example, is to write your custom implementations of any of the standard page rendering [handlers](https://www.tornadoweb.org/en/stable/guide/structure.html#subclassing-requesthandler) included in NBViewer, e.g. by subclassing the original handlers to include custom logic along with custom output possibilities, and then have these custom handlers always loaded by default, by modifying the corresponding lines in the config file. This is effectively another way to extend NBViewer.
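For example, a minimal `nbviewer_config.py` that pins the port, written in the standard traitlets syntax described above:

```python
# nbviewer_config.py -- standard Jupyter/traitlets config syntax.
# get_config() is provided by the config loader at startup.
c = get_config()

# Serve on port 9000 (equivalent to `python -m nbviewer --NBViewer.port=9000`).
c.NBViewer.port = 9000
```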
## Securing the Notebook Viewer
You can run the viewer as a [JupyterHub 0.7+ service](https://jupyterhub.readthedocs.io/en/latest/reference/services.html). Running the viewer as a service prevents users who have not authenticated with the Hub from accessing the nbviewer instance. This setup can be useful for protecting access to local notebooks rendered with the `--localfiles` option.
Add an entry like the following to your `jupyterhub_config.py` to have it start nbviewer as a managed service:
```python
c.JupyterHub.services = [
{
# the /services/<name> path for accessing the notebook viewer
'name': 'nbviewer',
# the interface and port nbviewer will use
'url': 'http://127.0.0.1:9000',
# the path to nbviewer repo
'cwd': '<path to repo>',
# command to start the nbviewer
'command': ['python', '-m', 'nbviewer']
}
]
```
The nbviewer instance will automatically read the [various `JUPYTERHUB_*` environment variables](http://jupyterhub.readthedocs.io/en/latest/reference/services.html#launching-a-hub-managed-service) and configure itself accordingly. You can also run the nbviewer instance as an [externally managed JupyterHub service](http://jupyterhub.readthedocs.io/en/latest/reference/services.html#externally-managed-services), but must set the requisite environment variables yourself.


@ -0,0 +1,8 @@
nbviewer:
build: .
links:
- nbcache
ports:
- 8080:8080
nbcache:
image: memcached


@ -0,0 +1,17 @@
service:
type: NodePort
ports:
nodePort: 32567
resources:
requests:
memory: null
cpu: null
nbviewer:
extraArgs:
- "--logging=debug"
memcached:
replicaCount: 1
pdbMinAvailable: 0


@ -0,0 +1,10 @@
apiVersion: v1
name: nbviewer
version: 0.0.1
appVersion: 1.0.1
description: Jupyter Notebook Viewer
home: https://nbviewer.jupyter.org
sources:
- https://github.com/jupyter/nbviewer
kubeVersion: '>=1.11.0-0'
tillerVersion: '>=2.11.0-0'


@ -0,0 +1,6 @@
dependencies:
- name: memcached
repository: https://kubernetes-charts.storage.googleapis.com
version: 3.2.2
digest: sha256:b0f92f7e3f8bfeb286cf8566d86c9c795a2712d7c690bdf66eb037dbae7b9036
generated: "2020-03-03T10:17:51.050357+01:00"


@ -0,0 +1,4 @@
dependencies:
- name: memcached
version: 3.2.2
repository: https://kubernetes-charts.storage.googleapis.com


@ -0,0 +1,48 @@
{{/*
Expand the name of the chart.
*/}}
{{- define "nbviewer.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
*/}}
{{- define "nbviewer.fullname" -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if contains $name .Release.Name -}}
{{- .Release.Name | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}
{{- end -}}

{{/*
Common labels
*/}}
{{- define "nbviewer.labels" -}}
app.kubernetes.io/name: {{ include "nbviewer.name" . }}
helm.sh/chart: {{ include "nbviewer.chart" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}

{{- define "nbviewer.matchLabels" -}}
app.kubernetes.io/name: {{ include "nbviewer.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}

{{/*
Create chart name and version as used by the chart label.
*/}}
{{- define "nbviewer.chart" -}}
{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" -}}
{{- end -}}
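The `nbviewer.fullname` template above encodes Kubernetes' 63-character limit on DNS-1123 names. A rough Python mirror of that logic (function and argument names here are illustrative, not part of the chart):

```python
def fullname(release_name, chart_name, name_override=None, fullname_override=None):
    # Mirrors the Helm template: truncate to 63 characters (the DNS label
    # limit for Kubernetes names) and strip a trailing "-" left by truncation.
    if fullname_override:
        return fullname_override[:63].removesuffix("-")
    name = name_override or chart_name
    if name in release_name:
        return release_name[:63].removesuffix("-")
    return f"{release_name}-{name}"[:63].removesuffix("-")
```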

@@ -0,0 +1,149 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ template "nbviewer.fullname" . }}
  labels:
    component: nbviewer
    {{- include "nbviewer.labels" . | nindent 4 }}
spec:
  replicas: {{ .Values.replicas }}
  selector:
    matchLabels:
      component: nbviewer
      {{- include "nbviewer.matchLabels" . | nindent 6 }}
  {{- if .Values.deploymentStrategy }}
  strategy:
    {{- .Values.deploymentStrategy | toYaml | trimSuffix "\n" | nindent 4 }}
  {{- end }}
  template:
    metadata:
      labels:
        component: nbviewer
        {{- include "nbviewer.matchLabels" . | nindent 8 }}
      annotations:
        # This lets us autorestart when the secret changes!
        checksum/secret: {{ include (print .Template.BasePath "/secret.yaml") . | sha256sum }}
        {{- if .Values.annotations }}
        {{- .Values.annotations | toYaml | trimSuffix "\n" | nindent 8 }}
        {{- end }}
    spec:
      nodeSelector: {{ toJson .Values.nodeSelector }}
      volumes:
        - name: secret
          secret:
            secretName: {{ template "nbviewer.fullname" . }}
            items:
              - key: newrelic-ini
                path: newrelic.ini
        {{- if .Values.extraVolumes }}
        {{- .Values.extraVolumes | toYaml | trimSuffix "\n" | nindent 8 }}
        {{- end }}
      {{- if .Values.initContainers }}
      initContainers:
        {{- .Values.initContainers | toYaml | trimSuffix "\n" | nindent 8 }}
      {{- end }}
      containers:
        {{- if .Values.extraContainers }}
        {{- .Values.extraContainers | toYaml | trimSuffix "\n" | nindent 8 }}
        {{- end }}
        - name: nbviewer
          image: {{ .Values.image }}
          command:
            {{- if .Values.nbviewer.newrelicIni }}
            - newrelic-admin
            - run-python
            {{- else }}
            - python3
            {{- end }}
            - "-m"
            - nbviewer
            - --port=5000
            {{- if .Values.nbviewer.extraArgs }}
            {{- .Values.nbviewer.extraArgs | toYaml | trimSuffix "\n" | nindent 12 }}
            {{- end }}
          volumeMounts:
            {{- if .Values.nbviewer.newrelicIni }}
            - mountPath: /etc/nbviewer/newrelic.ini
              name: secret
              subPath: newrelic.ini
            {{- end }}
            # - mountPath: /etc/nbviewer/values.json
            #   subPath: values.json
            #   name: values
            {{- if .Values.extraVolumeMounts }}
            {{- .Values.extraVolumeMounts | toYaml | trimSuffix "\n" | nindent 12 }}
            {{- end }}
          resources:
            {{- .Values.resources | toYaml | trimSuffix "\n" | nindent 12 }}
          {{- with .Values.imagePullPolicy }}
          imagePullPolicy: {{ . }}
          {{- end }}
          env:
            - name: PYTHONUNBUFFERED
              value: "1"
            - name: HELM_RELEASE_NAME
              value: {{ .Release.Name | quote }}
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            {{- if .Values.github.clientId }}
            - name: GITHUB_OAUTH_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: github-clientId
            {{- end }}
            {{- if .Values.github.clientSecret }}
            - name: GITHUB_OAUTH_SECRET
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: github-clientSecret
            {{- end }}
            {{- if .Values.github.accessToken }}
            - name: GITHUB_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: github-accessToken
            {{- end }}
            - name: MEMCACHIER_SERVERS
              value: {{ .Release.Name }}-memcached:11211
            - name: NEW_RELIC_CONFIG_FILE
              value: /etc/nbviewer/newrelic.ini
            {{- if .Values.extraEnv }}
            {{- range $key, $value := .Values.extraEnv }}
            - name: {{ $key | quote }}
              value: {{ $value | quote }}
            {{- end }}
            {{- end }}
          ports:
            - containerPort: 5000
              name: nbviewer
          {{- if .Values.livenessProbe.enabled }}
          # livenessProbe notes:
          # We don't know how long hub database upgrades could take,
          # so having a liveness probe could be a bit risky unless we set
          # an initialDelaySeconds value with a long enough margin for that
          # not to be an issue. If it is too short, we could end up aborting
          # database upgrades midway or ending up in an infinite restart
          # loop.
          livenessProbe:
            initialDelaySeconds: {{ .Values.livenessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.livenessProbe.periodSeconds }}
            httpGet:
              path: {{ .Values.nbviewer.baseUrl | trimSuffix "/" | quote }}
              port: nbviewer
          {{- end }}
          {{- if .Values.readinessProbe.enabled }}
          readinessProbe:
            initialDelaySeconds: {{ .Values.readinessProbe.initialDelaySeconds }}
            periodSeconds: {{ .Values.readinessProbe.periodSeconds }}
            httpGet:
              path: {{ .Values.nbviewer.baseUrl | trimSuffix "/" | quote }}
              port: nbviewer
          {{- end }}

@@ -0,0 +1,13 @@
{{- if .Values.pdb.enabled -}}
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: {{ template "nbviewer.fullname" . }}
  labels:
    {{- include "nbviewer.labels" . | nindent 4 }}
spec:
  minAvailable: {{ .Values.pdb.minAvailable }}
  selector:
    matchLabels:
      {{- include "nbviewer.matchLabels" . | nindent 6 }}
{{- end }}

@@ -0,0 +1,13 @@
kind: Secret
apiVersion: v1
metadata:
  name: {{ template "nbviewer.fullname" . }}
  labels:
    {{- include "nbviewer.labels" . | nindent 4 }}
type: Opaque
data:
  github-accessToken: {{ .Values.github.accessToken | b64enc | quote }}
  github-clientId: {{ .Values.github.clientId | b64enc | quote }}
  github-clientSecret: {{ .Values.github.clientSecret | b64enc | quote }}
  statuspage-apiKey: {{ .Values.statuspage.apiKey | b64enc | quote }}
  newrelic-ini: {{ .Values.nbviewer.newrelicIni | b64enc | quote }}
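The `b64enc | quote` pipeline above is ordinary base64 over the UTF-8 bytes, which is what Kubernetes expects for Secret `data` values. An equivalent sketch in Python, handy for checking what actually lands in the Secret:

```python
import base64

def b64enc(value: str) -> str:
    # Equivalent of Helm's b64enc: encode to UTF-8 bytes, base64 them,
    # and return the result as text.
    return base64.b64encode(value.encode("utf-8")).decode("ascii")
```

Note that `b64enc("")` is simply `""`, which is why the empty-string defaults in `values.yaml` still template cleanly.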

@@ -0,0 +1,24 @@
apiVersion: v1
kind: Service
metadata:
  name: {{ template "nbviewer.fullname" . }}
  labels:
    {{- include "nbviewer.labels" . | nindent 4 }}
  annotations:
    {{- if .Values.service.annotations }}
    {{- .Values.service.annotations | toYaml | nindent 4 }}
    {{- end }}
spec:
  type: {{ .Values.service.type }}
  {{- if .Values.service.loadBalancerIP }}
  loadBalancerIP: {{ .Values.service.loadBalancerIP }}
  {{- end }}
  selector:
    {{- include "nbviewer.matchLabels" . | nindent 4 }}
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000
      {{- if .Values.service.ports.nodePort }}
      nodePort: {{ .Values.service.ports.nodePort }}
      {{- end }}

@@ -0,0 +1,75 @@
{{- if .Values.statuspage.enabled -}}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ template "nbviewer.fullname" . }}-statuspage
  labels:
    component: statuspage
    {{- include "nbviewer.labels" . | nindent 4 }}
spec:
  replicas: 1
  selector:
    matchLabels:
      component: statuspage
      {{- include "nbviewer.matchLabels" . | nindent 6 }}
  template:
    metadata:
      labels:
        component: statuspage
        {{- include "nbviewer.matchLabels" . | nindent 8 }}
      annotations:
        # This lets us autorestart when the secret changes!
        checksum/secret: {{ include (print .Template.BasePath "/secret.yaml") . | sha256sum }}
        {{- if .Values.annotations }}
        {{- .Values.annotations | toYaml | trimSuffix "\n" | nindent 8 }}
        {{- end }}
    spec:
      nodeSelector: {{ toJson .Values.nodeSelector }}
      volumes:
        - name: secret
          secret:
            secretName: {{ template "nbviewer.fullname" . }}
      containers:
        - name: statuspage
          image: {{ .Values.statuspage.image }}
          resources:
            {{- .Values.statuspage.resources | toYaml | trimSuffix "\n" | nindent 12 }}
          {{- with .Values.imagePullPolicy }}
          imagePullPolicy: {{ . }}
          {{- end }}
          env:
            - name: PYTHONUNBUFFERED
              value: "1"
            {{- if .Values.github.clientId }}
            - name: GITHUB_OAUTH_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: github-clientId
            {{- end }}
            {{- if .Values.github.clientSecret }}
            - name: GITHUB_OAUTH_SECRET
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: github-clientSecret
            {{- end }}
            {{- if .Values.github.accessToken }}
            - name: GITHUB_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: github-accessToken
            {{- end }}
            - name: STATUSPAGE_API_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ template "nbviewer.fullname" . }}
                  key: statuspage-apiKey
            - name: STATUSPAGE_PAGE_ID
              value: {{ .Values.statuspage.pageId }}
            - name: STATUSPAGE_METRIC_ID
              value: {{ .Values.statuspage.metricId }}
{{- end -}}

@@ -0,0 +1,63 @@
image: "jupyter/nbviewer"
imagePullPolicy: null
nodeSelector: null
pdbMinAvailable: 2
pdb:
  minAvailable: 1
resources:
  requests:
    memory: 256M
    cpu: "1"
annotations: {}
extraContainers: null
extraVolumes: null
livenessProbe:
  enabled: true
  initialDelaySeconds: 5
  periodSeconds: 10
readinessProbe:
  enabled: true
  initialDelaySeconds: 5
  periodSeconds: 10
service:
  type: LoadBalancer
  ports:
    nodePort: null
github:
  accessToken: ""
  clientId: ""
  clientSecret: ""
nbviewer:
  baseUrl: "/"
  extraArgs: []
  newrelicIni: ""
memcached:
  AntiAffinity: soft
  replicaCount: 2
  pdbMinAvailable: 1
  updateStrategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
statuspage:
  enabled: false
  image: jupyter/nbviewer-statuspage
  resources: null
  apiKey: ""
  pageId: ""
  metricId: ""

@@ -0,0 +1,4 @@
#!/bin/bash
# Tag the just-built image with the first 8 characters of the commit SHA and push it.
SHA=${SOURCE_COMMIT::8}
docker tag "$DOCKER_REPO:$DOCKER_TAG" "$DOCKER_REPO:$SHA"
docker push "$DOCKER_REPO:$SHA"

@@ -0,0 +1,4 @@
from ._version import get_versions
__version__ = get_versions()["version"]
del get_versions

@@ -0,0 +1,3 @@
from nbviewer.app import main
main()

@@ -0,0 +1,553 @@
# This file helps to compute a version number in source trees obtained from
# git-archive tarball (such as those provided by github's download-from-tag
# feature). Distribution tarballs (built by setup.py sdist) and build
# directories (produced by setup.py build) will contain a much shorter file
# that just contains the computed version number.
# This file is released into the public domain. Generated by
# versioneer-0.18 (https://github.com/warner/python-versioneer)
"""Git implementation of _version.py."""
import errno
import os
import re
import subprocess
import sys
def get_keywords():
"""Get the keywords needed to look up the version information."""
# these strings will be replaced by git during git-archive.
# setup.py/versioneer.py will grep for the variable names, so they must
# each be defined on a line of their own. _version.py will just call
# get_keywords().
git_refnames = "$Format:%d$"
git_full = "$Format:%H$"
git_date = "$Format:%ci$"
keywords = {"refnames": git_refnames, "full": git_full, "date": git_date}
return keywords
class VersioneerConfig:
"""Container for Versioneer configuration parameters."""
def get_config():
"""Create, populate and return the VersioneerConfig() object."""
# these strings are filled in when 'setup.py versioneer' creates
# _version.py
cfg = VersioneerConfig()
cfg.VCS = "git"
cfg.style = "pep440"
cfg.tag_prefix = ""
cfg.parentdir_prefix = "None"
cfg.versionfile_source = "nbviewer/_version.py"
cfg.verbose = False
return cfg
class NotThisMethod(Exception):
"""Exception raised if a method is not valid for the current scenario."""
LONG_VERSION_PY = {}
HANDLERS = {}
def register_vcs_handler(vcs, method): # decorator
"""Decorator to mark a method as the handler for a particular VCS."""
def decorate(f):
"""Store f in HANDLERS[vcs][method]."""
if vcs not in HANDLERS:
HANDLERS[vcs] = {}
HANDLERS[vcs][method] = f
return f
return decorate
def run_command(commands, args, cwd=None, verbose=False, hide_stderr=False, env=None):
"""Call the given command(s)."""
assert isinstance(commands, list)
p = None
for c in commands:
try:
dispcmd = str([c] + args)
# remember shell=False, so use git.cmd on windows, not just git
p = subprocess.Popen(
[c] + args,
cwd=cwd,
env=env,
stdout=subprocess.PIPE,
stderr=(subprocess.PIPE if hide_stderr else None),
)
break
except EnvironmentError:
e = sys.exc_info()[1]
if e.errno == errno.ENOENT:
continue
if verbose:
print("unable to run %s" % dispcmd)
print(e)
return None, None
else:
if verbose:
print("unable to find command, tried %s" % (commands,))
return None, None
stdout = p.communicate()[0].strip()
if sys.version_info[0] >= 3:
stdout = stdout.decode()
if p.returncode != 0:
if verbose:
print("unable to run %s (error)" % dispcmd)
print("stdout was %s" % stdout)
return None, p.returncode
return stdout, p.returncode
def versions_from_parentdir(parentdir_prefix, root, verbose):
"""Try to determine the version from the parent directory name.
Source tarballs conventionally unpack into a directory that includes both
the project name and a version string. We will also support searching up
two directory levels for an appropriately named parent directory
"""
rootdirs = []
for i in range(3):
dirname = os.path.basename(root)
if dirname.startswith(parentdir_prefix):
return {
"version": dirname[len(parentdir_prefix) :],
"full-revisionid": None,
"dirty": False,
"error": None,
"date": None,
}
else:
rootdirs.append(root)
root = os.path.dirname(root) # up a level
if verbose:
print(
"Tried directories %s but none started with prefix %s"
% (str(rootdirs), parentdir_prefix)
)
raise NotThisMethod("rootdir doesn't start with parentdir_prefix")
@register_vcs_handler("git", "get_keywords")
def git_get_keywords(versionfile_abs):
"""Extract version information from the given file."""
# the code embedded in _version.py can just fetch the value of these
# keywords. When used from setup.py, we don't want to import _version.py,
# so we do it with a regexp instead. This function is not used from
# _version.py.
keywords = {}
try:
f = open(versionfile_abs, "r")
for line in f.readlines():
if line.strip().startswith("git_refnames ="):
mo = re.search(r'=\s*"(.*)"', line)
if mo:
keywords["refnames"] = mo.group(1)
if line.strip().startswith("git_full ="):
mo = re.search(r'=\s*"(.*)"', line)
if mo:
keywords["full"] = mo.group(1)
if line.strip().startswith("git_date ="):
mo = re.search(r'=\s*"(.*)"', line)
if mo:
keywords["date"] = mo.group(1)
f.close()
except EnvironmentError:
pass
return keywords
@register_vcs_handler("git", "keywords")
def git_versions_from_keywords(keywords, tag_prefix, verbose):
"""Get version information from git keywords."""
if not keywords:
raise NotThisMethod("no keywords at all, weird")
date = keywords.get("date")
if date is not None:
# git-2.2.0 added "%cI", which expands to an ISO-8601 -compliant
# datestamp. However we prefer "%ci" (which expands to an "ISO-8601
# -like" string, which we must then edit to make compliant), because
# it's been around since git-1.5.3, and it's too difficult to
# discover which version we're using, or to work around using an
# older one.
date = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
refnames = keywords["refnames"].strip()
if refnames.startswith("$Format"):
if verbose:
print("keywords are unexpanded, not using")
raise NotThisMethod("unexpanded keywords, not a git-archive tarball")
refs = set([r.strip() for r in refnames.strip("()").split(",")])
# starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of
# just "foo-1.0". If we see a "tag: " prefix, prefer those.
TAG = "tag: "
tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)])
if not tags:
# Either we're using git < 1.8.3, or there really are no tags. We use
# a heuristic: assume all version tags have a digit. The old git %d
# expansion behaves like git log --decorate=short and strips out the
# refs/heads/ and refs/tags/ prefixes that would let us distinguish
# between branches and tags. By ignoring refnames without digits, we
# filter out many common branch names like "release" and
# "stabilization", as well as "HEAD" and "master".
tags = set([r for r in refs if re.search(r"\d", r)])
if verbose:
print("discarding '%s', no digits" % ",".join(refs - tags))
if verbose:
print("likely tags: %s" % ",".join(sorted(tags)))
for ref in sorted(tags):
# sorting will prefer e.g. "2.0" over "2.0rc1"
if ref.startswith(tag_prefix):
r = ref[len(tag_prefix) :]
if verbose:
print("picking %s" % r)
return {
"version": r,
"full-revisionid": keywords["full"].strip(),
"dirty": False,
"error": None,
"date": date,
}
# no suitable tags, so version is "0+unknown", but full hex is still there
if verbose:
print("no suitable tags, using unknown + full revision id")
return {
"version": "0+unknown",
"full-revisionid": keywords["full"].strip(),
"dirty": False,
"error": "no suitable tags",
"date": None,
}
@register_vcs_handler("git", "pieces_from_vcs")
def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command):
"""Get version from 'git describe' in the root of the source tree.
This only gets called if the git-archive 'subst' keywords were *not*
expanded, and _version.py hasn't already been rewritten with a short
version string, meaning we're inside a checked out source tree.
"""
GITS = ["git"]
if sys.platform == "win32":
GITS = ["git.cmd", "git.exe"]
out, rc = run_command(GITS, ["rev-parse", "--git-dir"], cwd=root, hide_stderr=True)
if rc != 0:
if verbose:
print("Directory %s not under git control" % root)
raise NotThisMethod("'git rev-parse --git-dir' returned error")
# if there is a tag matching tag_prefix, this yields TAG-NUM-gHEX[-dirty]
# if there isn't one, this yields HEX[-dirty] (no NUM)
describe_out, rc = run_command(
GITS,
[
"describe",
"--tags",
"--dirty",
"--always",
"--long",
"--match",
"%s*" % tag_prefix,
],
cwd=root,
)
# --long was added in git-1.5.5
if describe_out is None:
raise NotThisMethod("'git describe' failed")
describe_out = describe_out.strip()
full_out, rc = run_command(GITS, ["rev-parse", "HEAD"], cwd=root)
if full_out is None:
raise NotThisMethod("'git rev-parse' failed")
full_out = full_out.strip()
pieces = {}
pieces["long"] = full_out
pieces["short"] = full_out[:7] # maybe improved later
pieces["error"] = None
# parse describe_out. It will be like TAG-NUM-gHEX[-dirty] or HEX[-dirty]
# TAG might have hyphens.
git_describe = describe_out
# look for -dirty suffix
dirty = git_describe.endswith("-dirty")
pieces["dirty"] = dirty
if dirty:
git_describe = git_describe[: git_describe.rindex("-dirty")]
# now we have TAG-NUM-gHEX or HEX
if "-" in git_describe:
# TAG-NUM-gHEX
mo = re.search(r"^(.+)-(\d+)-g([0-9a-f]+)$", git_describe)
if not mo:
# unparseable. Maybe git-describe is misbehaving?
pieces["error"] = "unable to parse git-describe output: '%s'" % describe_out
return pieces
# tag
full_tag = mo.group(1)
if not full_tag.startswith(tag_prefix):
if verbose:
fmt = "tag '%s' doesn't start with prefix '%s'"
print(fmt % (full_tag, tag_prefix))
pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % (
full_tag,
tag_prefix,
)
return pieces
pieces["closest-tag"] = full_tag[len(tag_prefix) :]
# distance: number of commits since tag
pieces["distance"] = int(mo.group(2))
# commit: short hex revision ID
pieces["short"] = mo.group(3)
else:
# HEX: no tags
pieces["closest-tag"] = None
count_out, rc = run_command(GITS, ["rev-list", "HEAD", "--count"], cwd=root)
pieces["distance"] = int(count_out) # total number of commits
# commit date: see ISO-8601 comment in git_versions_from_keywords()
date = run_command(GITS, ["show", "-s", "--format=%ci", "HEAD"], cwd=root)[
0
].strip()
pieces["date"] = date.strip().replace(" ", "T", 1).replace(" ", "", 1)
return pieces
def plus_or_dot(pieces):
"""Return a + if we don't already have one, else return a ."""
if "+" in pieces.get("closest-tag", ""):
return "."
return "+"
def render_pep440(pieces):
"""Build up version string, with post-release "local version identifier".
Our goal: TAG[+DISTANCE.gHEX[.dirty]] . Note that if you
get a tagged build and then dirty it, you'll get TAG+0.gHEX.dirty
Exceptions:
1: no tags. git_describe was just HEX. 0+untagged.DISTANCE.gHEX[.dirty]
"""
if pieces["closest-tag"]:
rendered = pieces["closest-tag"]
if pieces["distance"] or pieces["dirty"]:
rendered += plus_or_dot(pieces)
rendered += "%d.g%s" % (pieces["distance"], pieces["short"])
if pieces["dirty"]:
rendered += ".dirty"
else:
# exception #1
rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
if pieces["dirty"]:
rendered += ".dirty"
return rendered
def render_pep440_pre(pieces):
"""TAG[.post.devDISTANCE] -- No -dirty.
Exceptions:
1: no tags. 0.post.devDISTANCE
"""
if pieces["closest-tag"]:
rendered = pieces["closest-tag"]
if pieces["distance"]:
rendered += ".post.dev%d" % pieces["distance"]
else:
# exception #1
rendered = "0.post.dev%d" % pieces["distance"]
return rendered
def render_pep440_post(pieces):
"""TAG[.postDISTANCE[.dev0]+gHEX] .
The ".dev0" means dirty. Note that .dev0 sorts backwards
(a dirty tree will appear "older" than the corresponding clean one),
but you shouldn't be releasing software with -dirty anyways.
Exceptions:
1: no tags. 0.postDISTANCE[.dev0]
"""
if pieces["closest-tag"]:
rendered = pieces["closest-tag"]
if pieces["distance"] or pieces["dirty"]:
rendered += ".post%d" % pieces["distance"]
if pieces["dirty"]:
rendered += ".dev0"
rendered += plus_or_dot(pieces)
rendered += "g%s" % pieces["short"]
else:
# exception #1
rendered = "0.post%d" % pieces["distance"]
if pieces["dirty"]:
rendered += ".dev0"
rendered += "+g%s" % pieces["short"]
return rendered
def render_pep440_old(pieces):
"""TAG[.postDISTANCE[.dev0]] .
The ".dev0" means dirty.
Exceptions:
1: no tags. 0.postDISTANCE[.dev0]
"""
if pieces["closest-tag"]:
rendered = pieces["closest-tag"]
if pieces["distance"] or pieces["dirty"]:
rendered += ".post%d" % pieces["distance"]
if pieces["dirty"]:
rendered += ".dev0"
else:
# exception #1
rendered = "0.post%d" % pieces["distance"]
if pieces["dirty"]:
rendered += ".dev0"
return rendered
def render_git_describe(pieces):
"""TAG[-DISTANCE-gHEX][-dirty].
Like 'git describe --tags --dirty --always'.
Exceptions:
1: no tags. HEX[-dirty] (note: no 'g' prefix)
"""
if pieces["closest-tag"]:
rendered = pieces["closest-tag"]
if pieces["distance"]:
rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
else:
# exception #1
rendered = pieces["short"]
if pieces["dirty"]:
rendered += "-dirty"
return rendered
def render_git_describe_long(pieces):
"""TAG-DISTANCE-gHEX[-dirty].
Like 'git describe --tags --dirty --always --long'.
The distance/hash is unconditional.
Exceptions:
1: no tags. HEX[-dirty] (note: no 'g' prefix)
"""
if pieces["closest-tag"]:
rendered = pieces["closest-tag"]
rendered += "-%d-g%s" % (pieces["distance"], pieces["short"])
else:
# exception #1
rendered = pieces["short"]
if pieces["dirty"]:
rendered += "-dirty"
return rendered
def render(pieces, style):
"""Render the given version pieces into the requested style."""
if pieces["error"]:
return {
"version": "unknown",
"full-revisionid": pieces.get("long"),
"dirty": None,
"error": pieces["error"],
"date": None,
}
if not style or style == "default":
style = "pep440" # the default
if style == "pep440":
rendered = render_pep440(pieces)
elif style == "pep440-pre":
rendered = render_pep440_pre(pieces)
elif style == "pep440-post":
rendered = render_pep440_post(pieces)
elif style == "pep440-old":
rendered = render_pep440_old(pieces)
elif style == "git-describe":
rendered = render_git_describe(pieces)
elif style == "git-describe-long":
rendered = render_git_describe_long(pieces)
else:
raise ValueError("unknown style '%s'" % style)
return {
"version": rendered,
"full-revisionid": pieces["long"],
"dirty": pieces["dirty"],
"error": None,
"date": pieces.get("date"),
}
def get_versions():
"""Get version information or return default if unable to do so."""
# I am in _version.py, which lives at ROOT/VERSIONFILE_SOURCE. If we have
# __file__, we can work backwards from there to the root. Some
# py2exe/bbfreeze/non-CPython implementations don't do __file__, in which
# case we can only use expanded keywords.
cfg = get_config()
verbose = cfg.verbose
try:
return git_versions_from_keywords(get_keywords(), cfg.tag_prefix, verbose)
except NotThisMethod:
pass
try:
root = os.path.realpath(__file__)
# versionfile_source is the relative path from the top of the source
# tree (where the .git directory might live) to this file. Invert
# this to find the root from __file__.
for i in cfg.versionfile_source.split("/"):
root = os.path.dirname(root)
except NameError:
return {
"version": "0+unknown",
"full-revisionid": None,
"dirty": None,
"error": "unable to find root of source tree",
"date": None,
}
try:
pieces = git_pieces_from_vcs(cfg.tag_prefix, root, verbose)
return render(pieces, cfg.style)
except NotThisMethod:
pass
try:
if cfg.parentdir_prefix:
return versions_from_parentdir(cfg.parentdir_prefix, root, verbose)
except NotThisMethod:
pass
return {
"version": "0+unknown",
"full-revisionid": None,
"dirty": None,
"error": "unable to compute version",
"date": None,
}
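As a quick sanity check of the default `pep440` style above, the rendering rule can be exercised standalone. This is a condensed copy of `render_pep440` with a hypothetical `pieces` dict:

```python
def plus_or_dot(pieces):
    # "+" opens the PEP 440 local-version segment; later parts are dot-joined.
    return "." if "+" in (pieces.get("closest-tag") or "") else "+"

def render_pep440(pieces):
    # TAG[+DISTANCE.gHEX[.dirty]]; with no tag at all: 0+untagged.DISTANCE.gHEX[.dirty]
    if pieces["closest-tag"]:
        rendered = pieces["closest-tag"]
        if pieces["distance"] or pieces["dirty"]:
            rendered += plus_or_dot(pieces) + "%d.g%s" % (pieces["distance"], pieces["short"])
            if pieces["dirty"]:
                rendered += ".dirty"
    else:
        rendered = "0+untagged.%d.g%s" % (pieces["distance"], pieces["short"])
        if pieces["dirty"]:
            rendered += ".dirty"
    return rendered

pieces = {"closest-tag": "1.0.2", "distance": 3, "short": "abc1234", "dirty": True}
print(render_pep440(pieces))  # 1.0.2+3.gabc1234.dirty
```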

@@ -0,0 +1,807 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import io
import json
import logging
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures import ThreadPoolExecutor
from html import escape
from urllib.parse import urlparse
import markdown
from jinja2 import Environment
from jinja2 import FileSystemLoader
from nbconvert.exporters.export import exporter_map
from tornado import httpserver
from tornado import ioloop
from tornado import web
from tornado.curl_httpclient import curl_log
from tornado.log import access_log
from tornado.log import app_log
from tornado.log import LogFormatter
from traitlets import Any
from traitlets import Bool
from traitlets import default
from traitlets import Dict
from traitlets import Int
from traitlets import List
from traitlets import Set
from traitlets import Unicode
from traitlets.config import Application
from .cache import AsyncMultipartMemcache
from .cache import DummyAsyncCache
from .cache import MockCache
from .cache import pylibmc
from .client import NBViewerAsyncHTTPClient as HTTPClientClass
from .formats import default_formats
from .handlers import init_handlers
from .index import NoSearch
from .log import log_request
from .providers import default_providers
from .providers import default_rewrites
from .ratelimit import RateLimiter
from .utils import git_info
from .utils import jupyter_info
from .utils import url_path_join
try: # Python 3.8
from functools import cached_property
except ImportError:
from .utils import cached_property
from jupyter_server.base.handlers import FileFindHandler as StaticFileHandler
# -----------------------------------------------------------------------------
# Code
# -----------------------------------------------------------------------------
here = os.path.dirname(__file__)
pjoin = os.path.join
def nrhead():
try:
import newrelic.agent
except ImportError:
return ""
return newrelic.agent.get_browser_timing_header()
def nrfoot():
try:
import newrelic.agent
except ImportError:
return ""
return newrelic.agent.get_browser_timing_footer()
this_dir, this_filename = os.path.split(__file__)
FRONTPAGE_JSON = os.path.join(this_dir, "frontpage.json")
class NBViewer(Application):
name = Unicode("NBViewer")
aliases = Dict(
{
"base-url": "NBViewer.base_url",
"binder-base-url": "NBViewer.binder_base_url",
"cache-expiry-max": "NBViewer.cache_expiry_max",
"cache-expiry-min": "NBViewer.cache_expiry_min",
"config-file": "NBViewer.config_file",
"content-security-policy": "NBViewer.content_security_policy",
"default-format": "NBViewer.default_format",
"frontpage": "NBViewer.frontpage",
"host": "NBViewer.host",
"ipywidgets-base-url": "NBViewer.ipywidgets_base_url",
"jupyter-js-widgets-version": "NBViewer.jupyter_js_widgets_version",
"jupyter-widgets-html-manager-version": "NBViewer.jupyter_widgets_html_manager_version",
"localfiles": "NBViewer.localfiles",
"log-level": "Application.log_level",
"mathjax-url": "NBViewer.mathjax_url",
"mc-threads": "NBViewer.mc_threads",
"port": "NBViewer.port",
"processes": "NBViewer.processes",
"provider-rewrites": "NBViewer.provider_rewrites",
"providers": "NBViewer.providers",
"proxy-host": "NBViewer.proxy_host",
"proxy-port": "NBViewer.proxy_port",
"rate-limit": "NBViewer.rate_limit",
"rate-limit-interval": "NBViewer.rate_limit_interval",
"render-timeout": "NBViewer.render_timeout",
"sslcert": "NBViewer.sslcert",
"sslkey": "NBViewer.sslkey",
"static-path": "NBViewer.static_path",
"static-url-prefix": "NBViewer.static_url_prefix",
"statsd-host": "NBViewer.statsd_host",
"statsd-port": "NBViewer.statsd_port",
"statsd-prefix": "NBViewer.statsd_prefix",
"template-path": "NBViewer.template_path",
"threads": "NBViewer.threads",
}
)
flags = Dict(
{
"debug": (
{"Application": {"log_level": logging.DEBUG}},
"Set log-level to debug, for the most verbose logging.",
),
"generate-config": (
{"NBViewer": {"generate_config": True}},
"Generate default config file.",
),
"localfile-any-user": (
{"NBViewer": {"localfile_any_user": True}},
"Also serve files that are not readable by 'Other' on the local file system.",
),
"localfile-follow-symlinks": (
{"NBViewer": {"localfile_follow_symlinks": True}},
"Resolve/follow symbolic links to their target file using realpath.",
),
"no-cache": ({"NBViewer": {"no_cache": True}}, "Do not cache results."),
"no-check-certificate": (
{"NBViewer": {"no_check_certificate": True}},
"Do not validate SSL certificates.",
),
"y": (
{"NBViewer": {"answer_yes": True}},
"Answer yes to any questions (e.g. confirm overwrite).",
),
"yes": (
{"NBViewer": {"answer_yes": True}},
"Answer yes to any questions (e.g. confirm overwrite).",
),
}
)
# Use this to insert custom configuration of handlers for NBViewer extensions
handler_settings = Dict().tag(config=True)
create_handler = Unicode(
default_value="nbviewer.handlers.CreateHandler",
help="The Tornado handler to use for creation via frontpage form.",
).tag(config=True)
custom404_handler = Unicode(
default_value="nbviewer.handlers.Custom404",
help="The Tornado handler to use for rendering 404 templates.",
).tag(config=True)
faq_handler = Unicode(
default_value="nbviewer.handlers.FAQHandler",
help="The Tornado handler to use for rendering and viewing the FAQ section.",
).tag(config=True)
gist_handler = Unicode(
default_value="nbviewer.providers.gist.handlers.GistHandler",
help="The Tornado handler to use for viewing notebooks stored as GitHub Gists",
).tag(config=True)
github_blob_handler = Unicode(
default_value="nbviewer.providers.github.handlers.GitHubBlobHandler",
help="The Tornado handler to use for viewing notebooks stored as blobs on GitHub",
).tag(config=True)
github_tree_handler = Unicode(
default_value="nbviewer.providers.github.handlers.GitHubTreeHandler",
help="The Tornado handler to use for viewing directory trees on GitHub",
).tag(config=True)
github_user_handler = Unicode(
default_value="nbviewer.providers.github.handlers.GitHubUserHandler",
help="The Tornado handler to use for viewing all of a user's repositories on GitHub.",
).tag(config=True)
index_handler = Unicode(
default_value="nbviewer.handlers.IndexHandler",
help="The Tornado handler to use for rendering the frontpage section.",
).tag(config=True)
local_handler = Unicode(
default_value="nbviewer.providers.local.handlers.LocalFileHandler",
help="The Tornado handler to use for viewing notebooks found on a local filesystem",
).tag(config=True)
url_handler = Unicode(
default_value="nbviewer.providers.url.handlers.URLHandler",
help="The Tornado handler to use for viewing notebooks accessed via URL",
).tag(config=True)
user_gists_handler = Unicode(
default_value="nbviewer.providers.gist.handlers.UserGistsHandler",
help="The Tornado handler to use for viewing directory containing all of a user's Gists",
).tag(config=True)
answer_yes = Bool(
default_value=False,
help="Answer yes to any questions (e.g. confirm overwrite).",
).tag(config=True)
# base_url specified by the user
base_url = Unicode(default_value="/", help="URL base for the server").tag(
config=True
)
binder_base_url = Unicode(
default_value="https://mybinder.org/v2",
help="URL base for binder notebook execution service.",
).tag(config=True)
cache_expiry_max = Int(
default_value=2 * 60 * 60, help="Maximum cache expiry (seconds)."
).tag(config=True)
cache_expiry_min = Int(
default_value=10 * 60, help="Minimum cache expiry (seconds)."
).tag(config=True)
client = Any().tag(config=True)
@default("client")
def _default_client(self):
client = HTTPClientClass(log=self.log)
client.cache = self.cache
return client
config_file = Unicode(
default_value="nbviewer_config.py", help="The config file to load."
).tag(config=True)
content_security_policy = Unicode(
default_value="connect-src 'none';",
help="Content-Security-Policy header setting.",
).tag(config=True)
default_format = Unicode(
default_value="html", help="Format to use for legacy / URLs."
).tag(config=True)
frontpage = Unicode(
default_value=FRONTPAGE_JSON,
help="Path to json file containing frontpage content.",
).tag(config=True)
generate_config = Bool(
default_value=False, help="Generate default config file."
).tag(config=True)
host = Unicode(help="Run on the given interface.").tag(config=True)
@default("host")
def _default_host(self):
return self.default_endpoint["host"]
index = Any().tag(config=True)
@default("index")
def _load_index(self):
if os.environ.get("NBINDEX_PORT"):
self.log.info("Indexing notebooks")
tcp_index = os.environ.get("NBINDEX_PORT")
index_url = tcp_index.split("tcp://")[1]
index_host, index_port = index_url.split(":")
    else:
        self.log.info("Not indexing notebooks")
    indexer = NoSearch()
    return indexer
ipywidgets_base_url = Unicode(
default_value="https://unpkg.com/", help="URL base for ipywidgets JS package."
).tag(config=True)
jupyter_js_widgets_version = Unicode(
default_value="*", help="Version specifier for jupyter-js-widgets JS package."
).tag(config=True)
jupyter_widgets_html_manager_version = Unicode(
default_value="*",
help="Version specifier for @jupyter-widgets/html-manager JS package.",
).tag(config=True)
localfile_any_user = Bool(
default_value=False,
help="Also serve files that are not readable by 'Other' on the local file system.",
).tag(config=True)
localfile_follow_symlinks = Bool(
default_value=False,
help="Resolve/follow symbolic links to their target file using realpath.",
).tag(config=True)
localfiles = Unicode(
default_value="",
    help="Allow serving local files under /localfile/*; this can be a security risk.",
).tag(config=True)
mathjax_url = Unicode(
default_value="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/",
help="URL base for mathjax package.",
).tag(config=True)
# cache frontpage links for the maximum allowed time
max_cache_uris = Set().tag(config=True)
@default("max_cache_uris")
def _load_max_cache_uris(self):
max_cache_uris = {""}
for section in self.frontpage_setup["sections"]:
for link in section["links"]:
max_cache_uris.add("/" + link["target"])
return max_cache_uris
mc_threads = Int(
default_value=1, help="Number of threads to use for Async Memcache."
).tag(config=True)
no_cache = Bool(default_value=False, help="Do not cache results.").tag(config=True)
no_check_certificate = Bool(
default_value=False, help="Do not validate SSL certificates."
).tag(config=True)
port = Int(help="Run on the given port.").tag(config=True)
@default("port")
def _default_port(self):
return self.default_endpoint["port"]
processes = Int(
default_value=0, help="Use processes instead of threads for rendering."
).tag(config=True)
provider_rewrites = List(
trait=Unicode,
default_value=default_rewrites,
help="Full dotted package(s) that provide `uri_rewrites`.",
).tag(config=True)
providers = List(
trait=Unicode,
default_value=default_providers,
help="Full dotted package(s) that provide `default_handlers`.",
).tag(config=True)
proxy_host = Unicode(default_value="", help="The proxy URL.").tag(config=True)
proxy_port = Int(default_value=-1, help="The proxy port.").tag(config=True)
rate_limit = Int(
default_value=60,
help="Number of requests to allow in rate_limit_interval before limiting. Only requests that trigger a new render are counted.",
).tag(config=True)
rate_limit_interval = Int(
default_value=600, help="Interval (in seconds) for rate limiting."
).tag(config=True)
render_timeout = Int(
default_value=15,
help="Time to wait for a render to complete before showing the 'Working...' page.",
).tag(config=True)
sslcert = Unicode(help="Path to ssl .crt file.").tag(config=True)
sslkey = Unicode(help="Path to ssl .key file.").tag(config=True)
static_path = Unicode(
default_value=os.environ.get("NBVIEWER_STATIC_PATH", ""),
help="Custom path for loading additional static files.",
).tag(config=True)
static_url_prefix = Unicode(default_value="/static/").tag(config=True)
# Not exposed to end user for configuration, since needs to access base_url
_static_url_prefix = Unicode()
@default("_static_url_prefix")
def _load_static_url_prefix(self):
# Last '/' ensures that NBViewer still works regardless of whether user chooses e.g. '/static2/' or '/static2' as their custom prefix
return url_path_join(self._base_url, self.static_url_prefix, "/")
statsd_host = Unicode(
default_value="", help="Host running statsd to send metrics to."
).tag(config=True)
statsd_port = Int(
default_value=8125,
help="Port on which statsd is listening for metrics on statsd_host.",
).tag(config=True)
statsd_prefix = Unicode(
default_value="nbviewer",
help="Prefix to use for naming metrics sent to statsd.",
).tag(config=True)
template_path = Unicode(
default_value=os.environ.get("NBVIEWER_TEMPLATE_PATH", ""),
help="Custom template path for the nbviewer app (not rendered notebooks).",
).tag(config=True)
threads = Int(default_value=1, help="Number of threads to use for rendering.").tag(
config=True
)
# prefer the JupyterHub defined service prefix over the CLI
@cached_property
def _base_url(self):
return os.getenv("JUPYTERHUB_SERVICE_PREFIX", self.base_url)
@cached_property
def cache(self):
memcache_urls = os.environ.get(
"MEMCACHIER_SERVERS", os.environ.get("MEMCACHE_SERVERS")
)
# Handle linked Docker containers
if os.environ.get("NBCACHE_PORT"):
tcp_memcache = os.environ.get("NBCACHE_PORT")
memcache_urls = tcp_memcache.split("tcp://")[1]
if self.no_cache:
self.log.info("Not using cache")
cache = MockCache()
elif pylibmc and memcache_urls:
# setup memcache
mc_pool = ThreadPoolExecutor(self.mc_threads)
kwargs = dict(pool=mc_pool)
username = os.environ.get("MEMCACHIER_USERNAME", "")
password = os.environ.get("MEMCACHIER_PASSWORD", "")
if username and password:
kwargs["binary"] = True
kwargs["username"] = username
kwargs["password"] = password
self.log.info("Using SASL memcache")
else:
self.log.info("Using plain memcache")
cache = AsyncMultipartMemcache(memcache_urls.split(","), **kwargs)
else:
self.log.info("Using in-memory cache")
cache = DummyAsyncCache()
return cache
@cached_property
def default_endpoint(self):
# check if JupyterHub service options are available to use as defaults
if "JUPYTERHUB_SERVICE_URL" in os.environ:
url = urlparse(os.environ["JUPYTERHUB_SERVICE_URL"])
default_host, default_port = url.hostname, url.port
else:
default_host, default_port = "0.0.0.0", 5000
return {"host": default_host, "port": default_port}
@cached_property
def env(self):
env = Environment(loader=FileSystemLoader(self.template_paths), autoescape=True)
env.filters["markdown"] = markdown.markdown
try:
git_data = git_info(here)
except Exception as e:
self.log.error("Failed to get git info: %s", e)
git_data = {}
else:
git_data["msg"] = escape(git_data["msg"])
if self.no_cache:
# force Jinja2 to recompile template every time
env.globals.update(cache_size=0)
env.globals.update(
nrhead=nrhead,
nrfoot=nrfoot,
git_data=git_data,
jupyter_info=jupyter_info(),
len=len,
)
return env
@cached_property
def fetch_kwargs(self):
fetch_kwargs = dict(connect_timeout=10)
if self.proxy_host:
fetch_kwargs.update(proxy_host=self.proxy_host, proxy_port=self.proxy_port)
self.log.info(
            "Using web proxy {proxy_host}:{proxy_port}.".format(**fetch_kwargs)
)
if self.no_check_certificate:
fetch_kwargs.update(validate_cert=False)
self.log.info("Not validating SSL certificates")
return fetch_kwargs
@cached_property
def formats(self):
return self.configure_formats()
# load frontpage sections
@cached_property
def frontpage_setup(self):
with io.open(self.frontpage, "r") as f:
frontpage_setup = json.load(f)
    # check if the JSON has a 'sections' field; otherwise assume it is just a list
    # of sections, and provide defaults for the other fields
if "sections" not in frontpage_setup:
frontpage_setup = {
"title": "nbviewer",
"subtitle": "A simple way to share Jupyter notebooks",
"show_input": True,
"sections": frontpage_setup,
}
return frontpage_setup
# Attribute inherited from traitlets.config.Application, automatically used to style logs
# https://github.com/ipython/traitlets/blob/master/traitlets/config/application.py#L191
_log_formatter_cls = LogFormatter
# Need Tornado LogFormatter for color logs, keys 'color' and 'end_color' in log_format
# log_level is an observed traitlet, also inherited from traitlets.config.Application
# https://github.com/ipython/traitlets/blob/master/traitlets/config/application.py#L177
@default("log_level")
def _log_level_default(self):
return logging.INFO
# Ditto the above: https://github.com/ipython/traitlets/blob/master/traitlets/config/application.py#L197
@default("log_format")
def _log_format_default(self):
"""override default log format to include time and color, plus to always display the log level, not just when it's high"""
return "%(color)s[%(levelname)1.1s %(asctime)s.%(msecs).03d %(name)s %(module)s:%(lineno)d]%(end_color)s %(message)s"
# For consistency with JupyterHub logs
@default("log_datefmt")
def _log_datefmt_default(self):
"""Exclude date from default date format"""
return "%Y-%m-%d %H:%M:%S"
@cached_property
def pool(self):
if self.processes:
pool = ProcessPoolExecutor(self.processes)
else:
pool = ThreadPoolExecutor(self.threads)
return pool
@cached_property
def rate_limiter(self):
rate_limiter = RateLimiter(
limit=self.rate_limit, interval=self.rate_limit_interval, cache=self.cache
)
return rate_limiter
@cached_property
def static_paths(self):
default_static_path = pjoin(here, "static")
if self.static_path:
self.log.info("Using custom static path {}".format(self.static_path))
static_paths = [self.static_path, default_static_path]
else:
static_paths = [default_static_path]
return static_paths
@cached_property
def template_paths(self):
default_template_path = pjoin(here, "templates")
if self.template_path:
self.log.info("Using custom template path {}".format(self.template_path))
template_paths = [self.template_path, default_template_path]
else:
template_paths = [default_template_path]
return template_paths
def configure_formats(self, formats=None):
"""
Format-specific configuration.
"""
if formats is None:
formats = default_formats()
# This would be better defined in a class
self.config.HTMLExporter.template_file = "basic"
self.config.SlidesExporter.template_file = "slides_reveal"
self.config.TemplateExporter.template_path = [
os.path.join(os.path.dirname(__file__), "templates", "nbconvert")
]
for key, format in formats.items():
exporter_cls = format.get("exporter", exporter_map[key])
if self.processes:
# can't pickle exporter instances,
formats[key]["exporter"] = exporter_cls
else:
formats[key]["exporter"] = exporter_cls(
config=self.config, log=self.log
)
return formats
def init_tornado_application(self):
# handle handlers
handler_names = dict(
create_handler=self.create_handler,
custom404_handler=self.custom404_handler,
faq_handler=self.faq_handler,
gist_handler=self.gist_handler,
github_blob_handler=self.github_blob_handler,
github_tree_handler=self.github_tree_handler,
github_user_handler=self.github_user_handler,
index_handler=self.index_handler,
local_handler=self.local_handler,
url_handler=self.url_handler,
user_gists_handler=self.user_gists_handler,
)
handler_kwargs = {
"handler_names": handler_names,
"handler_settings": self.handler_settings,
}
handlers = init_handlers(
self.formats,
self.providers,
self._base_url,
self.localfiles,
**handler_kwargs
)
# NBConvert config
self.config.NbconvertApp.fileext = "html"
self.config.CSSHTMLHeaderTransformer.enabled = False
# DEBUG env implies both autoreload and log-level
if os.environ.get("DEBUG"):
self.log.setLevel(logging.DEBUG)
# input traitlets to settings
settings = dict(
# Allow FileFindHandler to load static directories from e.g. a Docker container
allow_remote_access=True,
base_url=self._base_url,
binder_base_url=self.binder_base_url,
cache=self.cache,
cache_expiry_max=self.cache_expiry_max,
cache_expiry_min=self.cache_expiry_min,
client=self.client,
config=self.config,
content_security_policy=self.content_security_policy,
default_format=self.default_format,
fetch_kwargs=self.fetch_kwargs,
formats=self.formats,
frontpage_setup=self.frontpage_setup,
google_analytics_id=os.getenv("GOOGLE_ANALYTICS_ID"),
gzip=True,
hub_api_token=os.getenv("JUPYTERHUB_API_TOKEN"),
hub_api_url=os.getenv("JUPYTERHUB_API_URL"),
hub_base_url=os.getenv("JUPYTERHUB_BASE_URL"),
index=self.index,
ipywidgets_base_url=self.ipywidgets_base_url,
jinja2_env=self.env,
jupyter_js_widgets_version=self.jupyter_js_widgets_version,
jupyter_widgets_html_manager_version=self.jupyter_widgets_html_manager_version,
localfile_any_user=self.localfile_any_user,
localfile_follow_symlinks=self.localfile_follow_symlinks,
localfile_path=os.path.abspath(self.localfiles),
log=self.log,
log_function=log_request,
mathjax_url=self.mathjax_url,
max_cache_uris=self.max_cache_uris,
pool=self.pool,
provider_rewrites=self.provider_rewrites,
providers=self.providers,
rate_limiter=self.rate_limiter,
render_timeout=self.render_timeout,
static_handler_class=StaticFileHandler,
# FileFindHandler expects list of static paths, so self.static_path*s* is correct
static_path=self.static_paths,
static_url_prefix=self._static_url_prefix,
statsd_host=self.statsd_host,
statsd_port=self.statsd_port,
statsd_prefix=self.statsd_prefix,
)
if self.localfiles:
self.log.warning(
"Serving local notebooks in %s, this can be a security risk",
self.localfiles,
)
# create the app
self.tornado_application = web.Application(handlers, **settings)
def init_logging(self):
# Note that we inherit a self.log attribute from traitlets.config.Application
# https://github.com/ipython/traitlets/blob/master/traitlets/config/application.py#L209
# as well as a log_level attribute
# https://github.com/ipython/traitlets/blob/master/traitlets/config/application.py#L177
    # This prevents double log messages, because tornado uses a root logger that
    # self.log is a child of. The logging module dispatches log messages to a logger
    # and all of its ancestors until propagate is set to False.
self.log.propagate = False
tornado_log = logging.getLogger("tornado")
# hook up tornado's loggers to our app handlers
for log in (app_log, access_log, tornado_log, curl_log):
# ensure all log statements identify the application they come from
log.name = self.log.name
log.parent = self.log
log.propagate = True
log.setLevel(self.log_level)
    # disable curl debug, which logs all headers and upstream request info (far too verbose)
curl_log.setLevel(max(self.log_level, logging.INFO))
# Mostly copied from JupyterHub because if it isn't broken then don't fix it.
def write_config_file(self):
"""Write our default config to a .py config file"""
config_file_dir = os.path.dirname(os.path.abspath(self.config_file))
if not os.path.isdir(config_file_dir):
self.exit(
"{} does not exist. The destination directory must exist before generating config file.".format(
config_file_dir
)
)
if os.path.exists(self.config_file) and not self.answer_yes:
answer = ""
def ask():
prompt = "Overwrite %s with default config? [y/N]" % self.config_file
try:
return input(prompt).lower() or "n"
except KeyboardInterrupt:
print("") # empty line
return "n"
answer = ask()
while not answer.startswith(("y", "n")):
print("Please answer 'yes' or 'no'")
answer = ask()
if answer.startswith("n"):
self.exit("Not overwriting config file with default.")
# Inherited method from traitlets.config.Application
config_text = self.generate_config_file()
if isinstance(config_text, bytes):
config_text = config_text.decode("utf8")
print("Writing default config to: %s" % self.config_file)
with open(self.config_file, mode="w") as f:
f.write(config_text)
self.exit("Wrote default config file.")
def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)
# parse command line with catch_config_error from traitlets.config.Application
super().initialize(*args, **kwargs)
if self.generate_config:
self.write_config_file()
# Inherited method from traitlets.config.Application
self.load_config_file(self.config_file)
self.init_logging()
self.init_tornado_application()
def main(argv=None):
# create and start the app
nbviewer = NBViewer()
app = nbviewer.tornado_application
# load ssl options
ssl_options = None
if nbviewer.sslcert:
ssl_options = {"certfile": nbviewer.sslcert, "keyfile": nbviewer.sslkey}
http_server = httpserver.HTTPServer(app, xheaders=True, ssl_options=ssl_options)
nbviewer.log.info(
"Listening on %s:%i, path %s",
nbviewer.host,
nbviewer.port,
app.settings["base_url"],
)
http_server.listen(nbviewer.port, nbviewer.host)
ioloop.IOLoop.current().start()
if __name__ == "__main__":
main()
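The endpoint-resolution logic in `default_endpoint` above can be exercised standalone; this sketch re-implements it outside the class (the free-standing function name is illustrative, not part of nbviewer's API):

```python
import os
from urllib.parse import urlparse

def default_endpoint(environ=os.environ):
    """Mirror of NBViewer.default_endpoint: prefer the JupyterHub service URL,
    fall back to 0.0.0.0:5000."""
    if "JUPYTERHUB_SERVICE_URL" in environ:
        url = urlparse(environ["JUPYTERHUB_SERVICE_URL"])
        return {"host": url.hostname, "port": url.port}
    return {"host": "0.0.0.0", "port": 5000}

# Outside JupyterHub the defaults apply; inside, the service URL wins.
assert default_endpoint({}) == {"host": "0.0.0.0", "port": 5000}
assert default_endpoint(
    {"JUPYTERHUB_SERVICE_URL": "http://10.0.0.5:8888/services/nbviewer/"}
) == {"host": "10.0.0.5", "port": 8888}
```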

@@ -0,0 +1,198 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import asyncio
import zlib
from asyncio import Future
from concurrent.futures import ThreadPoolExecutor
from time import monotonic
from tornado.log import app_log
try:
import pylibmc
except ImportError:
pylibmc = None
# -----------------------------------------------------------------------------
# Code
# -----------------------------------------------------------------------------
class MockCache(object):
    """Mock cache: stores nothing and always returns None on get."""
def __init__(self, *args, **kwargs):
pass
async def get(self, key):
f = Future()
f.set_result(None)
return await f
async def set(self, key, value, *args, **kwargs):
f = Future()
f.set_result(None)
return await f
async def add(self, key, value, *args, **kwargs):
f = Future()
f.set_result(True)
return await f
async def incr(self, key):
f = Future()
f.set_result(None)
return await f
class DummyAsyncCache(object):
"""Dummy Async Cache. Just stores things in a dict of fixed size."""
def __init__(self, limit=10):
self._cache = {}
self._cache_order = []
self.limit = limit
async def get(self, key):
f = Future()
f.set_result(self._get(key))
return await f
def _get(self, key):
value, deadline = self._cache.get(key, (None, None))
if deadline and deadline < monotonic():
self._cache.pop(key)
self._cache_order.remove(key)
else:
return value
async def set(self, key, value, expires=0):
if key in self._cache and self._cache_order[-1] != key:
idx = self._cache_order.index(key)
del self._cache_order[idx]
self._cache_order.append(key)
else:
if len(self._cache) >= self.limit:
oldest = self._cache_order.pop(0)
self._cache.pop(oldest)
self._cache_order.append(key)
if not expires:
deadline = None
else:
deadline = monotonic() + expires
self._cache[key] = (value, deadline)
f = Future()
f.set_result(True)
return await f
async def add(self, key, value, expires=0):
f = Future()
if self._get(key) is not None:
f.set_result(False)
else:
await self.set(key, value, expires)
f.set_result(True)
return await f
async def incr(self, key):
f = Future()
if self._get(key) is not None:
value, deadline = self._cache[key]
value = value + 1
self._cache[key] = (value, deadline)
else:
value = None
f.set_result(value)
return await f
class AsyncMemcache(object):
"""Wrap pylibmc.Client to run in a background thread
via concurrent.futures.ThreadPoolExecutor
"""
def __init__(self, *args, **kwargs):
self.pool = kwargs.pop("pool", None) or ThreadPoolExecutor(1)
self.mc = pylibmc.Client(*args, **kwargs)
self.mc_pool = pylibmc.ThreadMappedPool(self.mc)
self.loop = asyncio.get_event_loop()
async def _call_in_thread(self, method_name, *args, **kwargs):
# https://stackoverflow.com/questions/34376814/await-future-from-executor-future-cant-be-used-in-await-expression
key = args[0]
if "multi" in method_name:
key = sorted(key)[0].decode("ascii") + "[%i]" % len(key)
app_log.debug("memcache submit %s %s", method_name, key)
def f():
app_log.debug("memcache %s %s", method_name, key)
with self.mc_pool.reserve() as mc:
meth = getattr(mc, method_name)
return meth(*args, **kwargs)
return await self.loop.run_in_executor(self.pool, f)
async def get(self, *args, **kwargs):
return await self._call_in_thread("get", *args, **kwargs)
async def set(self, *args, **kwargs):
return await self._call_in_thread("set", *args, **kwargs)
async def add(self, *args, **kwargs):
return await self._call_in_thread("add", *args, **kwargs)
async def incr(self, *args, **kwargs):
return await self._call_in_thread("incr", *args, **kwargs)
class AsyncMultipartMemcache(AsyncMemcache):
"""subclass of AsyncMemcache that splits large files into multiple chunks
because memcached limits record size to 1MB
"""
def __init__(self, *args, **kwargs):
self.chunk_size = kwargs.pop("chunk_size", 950000)
self.max_chunks = kwargs.pop("max_chunks", 16)
super().__init__(*args, **kwargs)
async def get(self, key, *args, **kwargs):
keys = [("%s.%i" % (key, idx)).encode() for idx in range(self.max_chunks)]
values = await self._call_in_thread("get_multi", keys, *args, **kwargs)
parts = []
for key in keys:
if key not in values:
break
parts.append(values[key])
if parts:
compressed = b"".join(parts)
try:
result = zlib.decompress(compressed)
except zlib.error as e:
app_log.error("zlib decompression of %s failed: %s", key, e)
else:
return result
async def set(self, key, value, *args, **kwargs):
chunk_size = self.chunk_size
compressed = zlib.compress(value)
offsets = range(0, len(compressed), chunk_size)
app_log.debug("storing %s in %i chunks", key, len(offsets))
if len(offsets) > self.max_chunks:
raise ValueError("file is too large: %sB" % len(compressed))
values = {}
for idx, offset in enumerate(offsets):
values[("%s.%i" % (key, idx)).encode()] = compressed[
offset : offset + chunk_size
]
return await self._call_in_thread("set_multi", values, *args, **kwargs)
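The chunking scheme in `AsyncMultipartMemcache` works because numbered chunk keys reassemble in index order; this standalone round trip mirrors the `set`/`get` key and chunk logic without a memcached server (the helper names are illustrative):

```python
import zlib

CHUNK_SIZE = 950000  # just under memcached's 1 MB record limit

def split_chunks(key, value, chunk_size=CHUNK_SIZE):
    """Compress value and split it into numbered chunk records (set_multi input)."""
    compressed = zlib.compress(value)
    return {
        ("%s.%i" % (key, idx)).encode(): compressed[offset : offset + chunk_size]
        for idx, offset in enumerate(range(0, len(compressed), chunk_size))
    }

def join_chunks(key, records, max_chunks=16):
    """Reassemble consecutive chunks in index order and decompress."""
    parts = []
    for idx in range(max_chunks):
        chunk = records.get(("%s.%i" % (key, idx)).encode())
        if chunk is None:
            break
        parts.append(chunk)
    return zlib.decompress(b"".join(parts))

# A highly compressible 3 MB payload survives the round trip in a few chunks.
payload = b"x" * 3_000_000
records = split_chunks("nb", payload, chunk_size=1000)
assert join_chunks("nb", records) == payload
assert len(records) <= 16
```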

@@ -0,0 +1,121 @@
"""Async HTTP client with bonus features!
- Support caching via upstream 304 with ETag, Last-Modified
- Log request timings for profiling
"""
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
import asyncio
import hashlib
import pickle
import time
from tornado.curl_httpclient import CurlAsyncHTTPClient
from tornado.httpclient import HTTPRequest
from nbviewer.utils import time_block
# -----------------------------------------------------------------------------
# Async HTTP Client
# -----------------------------------------------------------------------------
# cache headers and their response:request mapping
# use this to map headers in cached response to the headers
# that should be set in the request.
cache_headers = {"ETag": "If-None-Match", "Last-Modified": "If-Modified-Since"}
class NBViewerAsyncHTTPClient(object):
"""Subclass of AsyncHTTPClient with bonus logging and caching!
    When a cached response exists, its ETag / Last-Modified headers are copied
    onto the request as If-None-Match / If-Modified-Since, and the cached
    response is returned directly without a new upstream request. Otherwise the
    upstream response is fetched and cached for as long as possible.
"""
cache = None
def __init__(self, log, client=None):
self.log = log
self.client = client or CurlAsyncHTTPClient()
def fetch(self, url, params=None, **kwargs):
request = HTTPRequest(url, **kwargs)
if request.user_agent is None:
request.user_agent = "Tornado-Async-Client"
# The future which will become the response upon awaiting.
response_future = asyncio.ensure_future(self.smart_fetch(request))
return response_future
async def smart_fetch(self, request):
        """Fetch the request, checking the cache first.
        On a cache hit, return the cached response directly; otherwise fetch
        upstream, cache the successful response, and return it.
        """
tic = time.time()
# when logging, use the URL without params
name = request.url.split("?")[0]
self.log.debug("Fetching %s", name)
# look for a cached response
cached_response = None
cache_key = hashlib.sha256(request.url.encode("utf8")).hexdigest()
cached_response = await self._get_cached_response(cache_key, name)
toc = time.time()
self.log.info("Upstream cache get %s %.2f ms", name, 1e3 * (toc - tic))
if cached_response:
self.log.info("Upstream cache hit %s", name)
# add cache headers, if any
for resp_key, req_key in cache_headers.items():
value = cached_response.headers.get(resp_key)
if value:
request.headers[req_key] = value
return cached_response
else:
self.log.info("Upstream cache miss %s", name)
response = await self.client.fetch(request)
dt = time.time() - tic
self.log.info("Fetched %s in %.2f ms", name, 1e3 * dt)
await self._cache_response(cache_key, name, response)
return response
async def _get_cached_response(self, cache_key, name):
"""Get the cached response, if any"""
if not self.cache:
return
try:
cached_pickle = await self.cache.get(cache_key)
if cached_pickle:
self.log.info("Type of self.cache is: %s", type(self.cache))
return pickle.loads(cached_pickle)
except Exception:
self.log.error("Upstream cache get failed %s", name, exc_info=True)
async def _cache_response(self, cache_key, name, response):
"""Cache the response, if any cache headers we understand are present."""
if not self.cache:
return
with time_block("Upstream cache set %s" % name, logger=self.log):
# cache the response
try:
pickle_response = pickle.dumps(response, pickle.HIGHEST_PROTOCOL)
await self.cache.set(cache_key, pickle_response)
except Exception:
self.log.error("Upstream cache failed %s" % name, exc_info=True)
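The cache key is simply a SHA-256 of the full request URL, and `cache_headers` maps cached response headers onto conditional request headers; both pieces can be sketched standalone (the helper names are illustrative, not nbviewer API):

```python
import hashlib

# response header -> conditional request header, as in cache_headers above
CACHE_HEADERS = {"ETag": "If-None-Match", "Last-Modified": "If-Modified-Since"}

def cache_key(url):
    """Same cache-key derivation as smart_fetch(): SHA-256 of the full URL."""
    return hashlib.sha256(url.encode("utf8")).hexdigest()

def conditional_headers(cached_headers):
    """Build If-None-Match / If-Modified-Since headers from a cached response."""
    return {
        req_key: cached_headers[resp_key]
        for resp_key, req_key in CACHE_HEADERS.items()
        if resp_key in cached_headers
    }

assert len(cache_key("https://example.com/nb.ipynb")) == 64
assert conditional_headers({"ETag": '"abc"'}) == {"If-None-Match": '"abc"'}
```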

@@ -0,0 +1,70 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
def default_formats():
"""
Return the currently-implemented formats.
These are not classes, but maybe should be: would they survive pickling?
- exporter:
an Exporter subclass.
if not defined, and key is in nbconvert.export.exporter_map, it will be added
automatically
- nbconvert_template:
the name of the nbconvert template to add to config.ExporterClass
- test:
a function(notebook_object, notebook_json)
conditionally offer a format based on content if truthy. see
`RenderingHandler.filter_exporters`
- postprocess:
a function(html, resources)
perform any modifications to html and resources after nbconvert
    - content_type:
a string specifying the Content-Type of the response from this format.
Defaults to text/html; charset=UTF-8
"""
def test_slides(nb, json):
        """Determine whether at least one cell has a metadata.slideshow.slide_type
        value other than blank or "-".
Parameters
----------
nb: nbformat.notebooknode.NotebookNode
Top of the parsed notebook object model
json: str
JSON source of the notebook, unused
Returns
-------
bool
"""
for cell in nb.cells:
if (
"metadata" in cell
and "slideshow" in cell.metadata
and cell.metadata.slideshow.get("slide_type", "-") != "-"
):
return True
return False
return {
"html": {"nbconvert_template": "basic", "label": "Notebook", "icon": "book"},
"slides": {
"nbconvert_template": "slides_reveal",
"label": "Slides",
"icon": "gift",
"test": test_slides,
},
"script": {
"label": "Code",
"icon": "code",
"content_type": "text/plain; charset=UTF-8",
},
}
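`test_slides` only needs attribute-style access to cell metadata, so a small dict subclass can stand in for nbformat's NotebookNode in a quick check (the `Node` stub and `has_slides` wrapper are illustrative, not nbviewer code):

```python
class Node(dict):
    """Minimal stand-in for nbformat.NotebookNode: a dict with attribute access."""
    __getattr__ = dict.__getitem__

def has_slides(nb):
    """Same test as test_slides() above, minus the unused JSON argument."""
    for cell in nb.cells:
        if (
            "metadata" in cell
            and "slideshow" in cell.metadata
            and cell.metadata.slideshow.get("slide_type", "-") != "-"
        ):
            return True
    return False

plain = Node(cells=[Node(metadata=Node())])
deck = Node(cells=[Node(metadata=Node(slideshow=Node(slide_type="slide")))])
assert has_slides(plain) is False
assert has_slides(deck) is True
```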

@@ -0,0 +1,92 @@
{"title": "nbviewer",
"subtitle": "A simple way to share Jupyter Notebooks",
"text": "Enter the location of a Jupyter Notebook to have it rendered here:",
"show_input": true,
"sections":[
{
"header":"Programming Languages",
"links":[
{
"text": "IPython",
"target": "/github/ipython/ipython/blob/6.x/examples/IPython%20Kernel/Index.ipynb",
"img": "/img/example-nb/ipython-thumb.png"
},
{
"text": "IRuby",
"target": "/github/SciRuby/sciruby-notebooks/blob/master/getting_started.ipynb",
"img": "/img/example-nb/iruby-nb.png"
},
{
"text": "IJulia",
"target": "/url/jdj.mit.edu/~stevenj/IJulia%20Preview.ipynb",
"img": "/img/example-nb/ijulia-preview.png"
}
]
},
{
"header":"Books",
"links":[
{
"text": "Python for Signal Processing",
"target": "/github/unpingco/Python-for-Signal-Processing/",
"img": "/img/example-nb/python-signal.png"
},
{
"text": "O'Reilly Book",
"target": "/github/ptwobrussell/Mining-the-Social-Web-2nd-Edition/tree/master/ipynb",
"img": "/img/example-nb/mining-slice.png"
},
{
"text": "Probabilistic Programming",
"target": "/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Ch1_Introduction_PyMC3.ipynb",
"img": "/img/example-nb/probabilistic-bayesian.png"
}
]
},
{
"header":"Misc",
"links":[
{
"text": "Data Visualization with Lightning",
"target": "/github/lightning-viz/lightning-example-notebooks/blob/master/index.ipynb",
"img": "/img/example-nb/lightning.png"
},
{
"text": "Interactive plots with Plotly",
"target": "/github/plotly/python-user-guide/blob/master/Index.ipynb",
"img": "/img/example-nb/plotly.png"
},
{
"text": "XKCD Plot With Matplotlib",
"target": "/url/jakevdp.github.com/downloads/notebooks/XKCD_plots.ipynb",
"img": "/img/example-nb/XKCD-Matplotlib.png"
},
{
"text": "Python for Vision Research",
"target": "/github/gestaltrevision/python_for_visres/blob/master/index.ipynb",
"img": "/img/example-nb/python_for_visres.png"
},
{
"text": "Non Parametric Regression",
"target": "/gist/fonnesbeck/2352771",
"img": "/img/example-nb/covariance.png"
},
{
"text": "Partial Differential Equations Solver",
"target": "/github/waltherg/notebooks/blob/master/2013-12-03-Crank_Nicolson.ipynb",
"img": "/img/example-nb/pde_solver_with_numpy.png"
},
{
"text": "Analysis of current events",
"target": "/gist/darribas/4121857",
"img": "/img/example-nb/gaza.png"
},
{
"text": "Jaynes-Cummings model",
"target": "/github/jrjohansson/qutip-lectures/blob/master/Lecture-1-Jaynes-Cumming-model.ipynb",
"img": "/img/example-nb/jaynes-cummings.png"
}
]
}
]
}

@@ -0,0 +1,33 @@
{"type": "object",
"properties": {
"title": {"type": "string"},
"subtitle": {"type": "string"},
"text": {"type": "string"},
"show_input": {"type": "boolean"},
"sections" : {
"type": "array",
"items": {
"type": "object",
"properties": {
"header": {"type": "string"},
"links": {
"type": "array",
"items": {
"type": "object",
"properties": {
"text": {"type": "string"},
"target": {"type": "string"},
"img": {"type": "string"}
},
"required": ["text", "target", "img"]
}
}
},
"required": [
"header",
"links"
]
}
}
}
}
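This schema pairs with the frontpage JSON above; its required-field rules can be enforced with a library like jsonschema, or checked by hand, as in this minimal sketch (the helper name is illustrative):

```python
def check_frontpage(setup):
    """Hand-rolled check of the schema's required fields (illustrative only)."""
    for section in setup.get("sections", []):
        if not all(field in section for field in ("header", "links")):
            return False
        for link in section["links"]:
            if not all(field in link for field in ("text", "target", "img")):
                return False
    return True

ok = {"sections": [{"header": "Books", "links": [
    {"text": "IPython", "target": "/github/ipython", "img": "/img/x.png"}]}]}
bad = {"sections": [{"header": "Books", "links": [{"text": "missing target"}]}]}
assert check_frontpage(ok) is True
assert check_frontpage(bad) is False
```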

@@ -0,0 +1,159 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
from tornado import web
from .providers import _load_handler_from_location
from .providers import provider_handlers
from .providers import provider_uri_rewrites
from .providers.base import BaseHandler
from .providers.base import format_prefix
from .utils import transform_ipynb_uri
from .utils import url_path_join
# -----------------------------------------------------------------------------
# Handler classes
# -----------------------------------------------------------------------------
class Custom404(BaseHandler):
"""Render our 404 template"""
def prepare(self):
# skip parent prepare() step, just render the 404
raise web.HTTPError(404)
class IndexHandler(BaseHandler):
"""Render the index"""
def render_index_template(self, **namespace):
return self.render_template(
"index.html",
title=self.frontpage_setup.get("title", None),
subtitle=self.frontpage_setup.get("subtitle", None),
text=self.frontpage_setup.get("text", None),
show_input=self.frontpage_setup.get("show_input", True),
sections=self.frontpage_setup.get("sections", []),
**namespace
)
def get(self):
self.finish(self.render_index_template())
class FAQHandler(BaseHandler):
"""Render the markdown FAQ page"""
def get(self):
self.finish(self.render_template("faq.md"))
class CreateHandler(BaseHandler):
"""Handle notebook-URL submission from the frontpage form;
only redirects to the appropriate provider URL.

"""
uri_rewrite_list = None
def post(self):
value = self.get_argument("gistnorurl", "")
redirect_url = transform_ipynb_uri(value, self.get_provider_rewrites())
self.log.info("create %s => %s", value, redirect_url)
self.redirect(url_path_join(self.base_url, redirect_url))
def get_provider_rewrites(self):
# storing this on a class attribute is a little icky, but is better
# than the global this was refactored from.
if self.uri_rewrite_list is None:
# providers is a list of module import paths
providers = self.settings["provider_rewrites"]
type(self).uri_rewrite_list = provider_uri_rewrites(providers)
return self.uri_rewrite_list
# -----------------------------------------------------------------------------
# Default handler URL mapping
# -----------------------------------------------------------------------------
def format_handlers(formats, urlspecs, **handler_settings):
"""
Tornado handler URLSpec of form (route, handler_class, initialize_kwargs)
https://www.tornadoweb.org/en/stable/web.html#tornado.web.URLSpec
kwargs passed to initialize are None by default but can be added
https://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.initialize
"""
urlspecs = [
(prefix + url, handler, {"format": format, "format_prefix": prefix})
for format in formats
for url, handler, initialize_kwargs in urlspecs
for prefix in [format_prefix + format]
]
for handler_setting in handler_settings:
if handler_settings[handler_setting]:
# here we modify the URLSpec dict to have the key-value pairs from
# handler_settings in NBViewer.init_tornado_application
for urlspec in urlspecs:
urlspec[2][handler_setting] = handler_settings[handler_setting]
return urlspecs
def init_handlers(formats, providers, base_url, localfiles, **handler_kwargs):
"""
`handler_kwargs` is a dict of dicts: first dict is `handler_names`, which
specifies the handler_classes to load for the providers, the second
is `handler_settings` (see comments in format_handlers)
Only `handler_settings` should get added to the initialize_kwargs in the
handler URLSpecs, which is why we pass only it to `format_handlers`
but both it and `handler_names` to `provider_handlers`
"""
handler_settings = handler_kwargs["handler_settings"]
handler_names = handler_kwargs["handler_names"]
create_handler = _load_handler_from_location(handler_names["create_handler"])
custom404_handler = _load_handler_from_location(handler_names["custom404_handler"])
faq_handler = _load_handler_from_location(handler_names["faq_handler"])
index_handler = _load_handler_from_location(handler_names["index_handler"])
# If requested endpoint matches multiple routes, it only gets handled by handler
# corresponding to the first matching route. So order of URLSpecs in this list matters.
pre_providers = [
("/?", index_handler, {}),
("/index.html", index_handler, {}),
(r"/faq/?", faq_handler, {}),
(r"/create/?", create_handler, {}),
# don't let super old browsers request data-uris
(r".*/data:.*;base64,.*", custom404_handler, {}),
]
post_providers = [(r"/(robots\.txt|favicon\.ico)", web.StaticFileHandler, {})]
# Add localfile handlers if the option is set
if localfiles:
# Put local provider first as per the comment at
# https://github.com/jupyter/nbviewer/pull/727#discussion_r144448440.
providers.insert(0, "nbviewer.providers.local")
handlers = provider_handlers(providers, **handler_kwargs)
raw_handlers = (
pre_providers
+ handlers
+ format_handlers(formats, handlers, **handler_settings)
+ post_providers
)
new_handlers = []
for handler in raw_handlers:
pattern = url_path_join(base_url, handler[0])
new_handler = tuple([pattern] + list(handler[1:]))
new_handlers.append(new_handler)
new_handlers.append((r".*", custom404_handler, {}))
return new_handlers
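To illustrate the expansion `format_handlers` performs, here is a standalone sketch with hypothetical format and handler names (the real function also merges `handler_settings` into each URLSpec kwargs dict):

```python
format_prefix = "/format/"


def expand_format_handlers(formats, urlspecs):
    # One URLSpec per (format, route) pair, each prefixed with /format/<name>
    return [
        (prefix + url, handler, {"format": fmt, "format_prefix": prefix})
        for fmt in formats
        for url, handler, _kwargs in urlspecs
        for prefix in [format_prefix + fmt]
    ]


specs = expand_format_handlers(
    ["slides", "script"], [(r"/url/(.*)", "URLHandler", {})]
)
print([route for route, _, _ in specs])
# → ['/format/slides/url/(.*)', '/format/script/url/(.*)']
```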

View File

@ -0,0 +1,54 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
"""
Classes for Indexing Notebooks
"""
import uuid
from tornado.log import app_log
class Indexer(object):
def index_notebook(self, notebook_url, notebook_contents):
raise NotImplementedError("index_notebook not implemented")
class NoSearch(Indexer):
def __init__(self):
pass
def index_notebook(self, notebook_url, notebook_contents, *args, **kwargs):
app_log.debug('Totally not indexing "{}"'.format(notebook_url))
class ElasticSearch(Indexer):
def __init__(self, host="127.0.0.1", port=9200):
from elasticsearch import Elasticsearch
self.elasticsearch = Elasticsearch([{"host": host, "port": port}])
def index_notebook(self, notebook_url, notebook_contents, public=False):
# uuid5 requires a str name on Python 3 and is deterministic,
# so the same URL always maps to the same document id
notebook_id = uuid.uuid5(uuid.NAMESPACE_URL, notebook_url)
# Notebooks API Model
# https://github.com/ipython/ipython/wiki/IPEP-16%3A-Notebook-multi-directory-dashboard-and-URL-mapping#notebooks-api
body = {"content": notebook_contents, "public": public}
resp = self.elasticsearch.index(
index="notebooks", doc_type="ipynb", body=body, id=notebook_id.hex
)
if resp["created"]:
app_log.info(
"Created new indexed notebook={}, public={}".format(
notebook_url, public
)
)
else:
app_log.info(
"Indexing old notebook={}, public={}".format(notebook_url, public)
)
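The id passed to Elasticsearch is worth noting: uuid5 is deterministic, so re-indexing the same notebook URL overwrites the existing document rather than creating a duplicate. A short sketch (the URL is hypothetical):

```python
import uuid

url = "https://example.com/notebook.ipynb"  # hypothetical notebook URL
a = uuid.uuid5(uuid.NAMESPACE_URL, url)
b = uuid.uuid5(uuid.NAMESPACE_URL, url)
print(a == b)      # → True: same URL always yields the same document id
print(len(a.hex))  # → 32: the hex form used as the Elasticsearch doc id
```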

View File

@ -0,0 +1,57 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import json
from tornado.log import access_log
from tornado.web import StaticFileHandler
def log_request(handler):
"""log a bit more information about each request than tornado's default
- move static file get success to debug-level (reduces noise)
- get proxied IP instead of proxy IP
- log referer for redirect and failed requests
- log user-agent for failed requests
"""
status = handler.get_status()
request = handler.request
if (
status == 304
or (status < 300 and isinstance(handler, StaticFileHandler))
or (status < 300 and request.uri == "/")
):
# static-file successes or any 304 NOT MODIFIED are debug-level
log_method = access_log.debug
elif status < 400:
log_method = access_log.info
elif status < 500:
log_method = access_log.warning
else:
log_method = access_log.error
request_time = 1000.0 * handler.request.request_time()
ns = dict(
status=status,
method=request.method,
ip=request.remote_ip,
uri=request.uri,
request_time=request_time,
)
msg = "{status} {method} {uri} ({ip}) {request_time:.2f}ms"
if status >= 300:
# log referers on redirects
ns["referer"] = request.headers.get("Referer", "None")
msg = msg + ' referer="{referer}"'
if status >= 400:
# log user agent for failed requests
ns["agent"] = request.headers.get("User-Agent", "Unknown")
msg = msg + ' user-agent="{agent}"'
if status >= 500 and status not in {502, 503}:
# log all headers if it caused an error
log_method(json.dumps(dict(request.headers), indent=2))
log_method(msg.format(**ns))
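The base message template is plain `str.format`; a quick sketch of the line it produces (the values here are made up):

```python
# Same template as log_request uses before referer/agent fields are appended.
msg = "{status} {method} {uri} ({ip}) {request_time:.2f}ms"
line = msg.format(
    status=200, method="GET", uri="/index.html",
    ip="127.0.0.1", request_time=12.3,
)
print(line)  # → 200 GET /index.html (127.0.0.1) 12.30ms
```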

View File

@ -0,0 +1,115 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
default_providers = [
"nbviewer.providers.{}".format(prov) for prov in ["url", "github", "gist"]
]
default_rewrites = [
"nbviewer.providers.{}".format(prov)
for prov in ["gist", "github", "dropbox", "url"]
]
def provider_handlers(providers, **handler_kwargs):
"""Load tornado URL handlers from an ordered list of dotted-notation modules
which contain a `default_handlers` function
`default_handlers` should accept a list of handlers and return an
augmented list of handlers: this allows the addition of, for
example, custom URLs which should be intercepted before being
handed to the basic `url` handler
`handler_kwargs` is a dict of dicts: first dict is `handler_names`, which
specifies the handler_classes to load for the providers, the second
is `handler_settings` (see comments in `format_handlers` in nbviewer/handlers.py)
"""
handler_names = handler_kwargs["handler_names"]
handler_settings = handler_kwargs["handler_settings"]
urlspecs = _load_provider_feature("default_handlers", providers, **handler_names)
for handler_setting in handler_settings:
if handler_settings[handler_setting]:
# here we modify the URLSpec dict to have the key-value pairs from
# handler_settings in NBViewer.init_tornado_application
# kwargs passed to initialize are None by default but can be added
# https://www.tornadoweb.org/en/stable/web.html#tornado.web.RequestHandler.initialize
for urlspec in urlspecs:
urlspec[2][handler_setting] = handler_settings[handler_setting]
return urlspecs
def provider_uri_rewrites(providers):
"""Load (regex, template) tuples from an ordered list of dotted-notation
modules which contain a `uri_rewrites` function
`uri_rewrites` should accept a list of rewrites and return an
augmented list of rewrites: this allows the addition of, for
example, the greedy behavior of the `gist` and `github` providers
"""
return _load_provider_feature("uri_rewrites", providers)
def _load_provider_feature(feature, providers, **handler_names):
"""Load the named feature from an ordered list of dotted-notation modules
which each implements the feature.
The feature will be passed a list of feature implementations and must
return that list, suitably modified.
`handler_names` is the same as the `handler_names` attribute of the NBViewer class
"""
# Ex: provider = 'nbviewer.providers.url'
# provider.rsplit('.', 1) = ['nbviewer.providers', 'url']
# provider_type = 'url'
provider_types = [provider.rsplit(".", 1)[-1] for provider in providers]
if "github" in provider_types:
provider_types.append("github_blob")
provider_types.append("github_tree")
provider_types.remove("github")
provider_handlers = {}
# Ex: provider_type = 'url'
for provider_type in provider_types:
# Ex: provider_handler_key = 'url_handler'
provider_handler_key = provider_type + "_handler"
try:
# Ex: handler_names['url_handler']
handler_names[provider_handler_key]
except KeyError:
continue
else:
# Ex: provider_handlers['url_handler'] = handler_names['url_handler']
provider_handlers[provider_handler_key] = handler_names[
provider_handler_key
]
features = []
# Ex: provider = 'nbviewer.providers.url'
for provider in providers:
# Ex: module = __import__('nbviewer.providers.url', fromlist=['default_handlers'])
module = __import__(provider, fromlist=[feature])
# Ex: getattr(module, 'default_handlers') = the `default_handlers` function from
# nbviewer.providers.url (in handlers.py of nbviewer/providers/url)
# so in example, features = nbviewer.providers.url.default_handlers(list_of_already_loaded_handlers, **handler_names)
# => features = list_of_already_loaded_handlers + [URLSpec of chosen URL handler]
features = getattr(module, feature)(features, **handler_names)
return features
def _load_handler_from_location(handler_location):
# Ex: handler_location = 'nbviewer.providers.url.URLHandler'
# module_name = 'nbviewer.providers.url', handler_name = 'URLHandler'
module_name, handler_name = tuple(handler_location.rsplit(".", 1))
module = __import__(module_name, fromlist=[handler_name])
handler = getattr(module, handler_name)
return handler
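`_load_handler_from_location` is a plain dotted-path import. The same mechanics, demonstrated here with a stdlib class instead of an nbviewer handler (the location string is hypothetical):

```python
# nbviewer would pass e.g. 'nbviewer.providers.url.handlers.URLHandler';
# we use a stdlib class so the sketch is self-contained.
location = "collections.OrderedDict"
module_name, attr_name = location.rsplit(".", 1)
module = __import__(module_name, fromlist=[attr_name])
cls = getattr(module, attr_name)
print(cls.__name__)  # → OrderedDict
```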

View File

@ -0,0 +1,788 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import asyncio
import hashlib
import pickle
import socket
import time
from contextlib import contextmanager
from datetime import datetime
from html import escape
from http.client import responses
from urllib.parse import quote
from urllib.parse import urlencode
from urllib.parse import urlparse
from urllib.parse import urlunparse
import statsd
from nbformat import current_nbformat
from nbformat import reads
from tornado import httpclient
from tornado import web
from tornado.concurrent import Future
from tornado.escape import url_escape
from tornado.escape import url_unescape
from tornado.escape import utf8
from tornado.ioloop import IOLoop
from ..render import NbFormatError
from ..render import render_notebook
from ..utils import EmptyClass
from ..utils import parse_header_links
from ..utils import time_block
from ..utils import url_path_join
try:
import pycurl
from tornado.curl_httpclient import CurlError
except ImportError:
pycurl = None
class CurlError(Exception):
pass
format_prefix = "/format/"
class BaseHandler(web.RequestHandler):
"""Base Handler class with common utilities"""
def initialize(self, format=None, format_prefix="", **handler_settings):
# format: str, optional
# Rendering format (e.g. script, slides, html)
self.format = format or self.default_format
self.format_prefix = format_prefix
self.http_client = httpclient.AsyncHTTPClient()
self.date_fmt = "%a, %d %b %Y %H:%M:%S UTC"
for handler_setting in handler_settings:
setattr(self, handler_setting, handler_settings[handler_setting])
# Overloaded methods
def redirect(self, url, *args, **kwargs):
purl = urlparse(url)
eurl = urlunparse(
(
purl.scheme,
purl.netloc,
"/".join(
[
url_escape(url_unescape(p), plus=False)
for p in purl.path.split("/")
]
),
purl.params,
purl.query,
purl.fragment,
)
)
return super().redirect(eurl, *args, **kwargs)
def set_default_headers(self):
self.add_header("Content-Security-Policy", self.content_security_policy)
async def prepare(self):
"""Check if the user is authenticated with JupyterHub if the hub
API endpoint and token are configured.
Redirect unauthenticated requests to the JupyterHub login page.
Do nothing if not running as a JupyterHub service.
"""
# if any of these are set, assume we want to do auth, even if
# we're misconfigured (better safe than sorry!)
if self.hub_api_url or self.hub_api_token or self.hub_base_url:
def redirect_to_login():
self.redirect(
url_path_join(self.hub_base_url, "/hub/login")
+ "?"
+ urlencode({"next": self.request.path})
)
encrypted_cookie = self.get_cookie(self.hub_cookie_name)
if not encrypted_cookie:
# no cookie == not authenticated
return redirect_to_login()
try:
# if the hub returns a success code, the user is known
await self.http_client.fetch(
url_path_join(
self.hub_api_url,
"authorizations/cookie",
self.hub_cookie_name,
quote(encrypted_cookie, safe=""),
),
headers={"Authorization": "token " + self.hub_api_token},
)
except httpclient.HTTPError as ex:
if ex.response.code == 404:
# hub does not recognize the cookie == not authenticated
return redirect_to_login()
# let all other errors surface: they're unexpected
raise ex
# Properties
@property
def base_url(self):
return self.settings["base_url"]
@property
def binder_base_url(self):
return self.settings["binder_base_url"]
@property
def cache(self):
return self.settings["cache"]
@property
def cache_expiry_max(self):
return self.settings.setdefault("cache_expiry_max", 120)
@property
def cache_expiry_min(self):
return self.settings.setdefault("cache_expiry_min", 60)
@property
def client(self):
return self.settings["client"]
@property
def config(self):
return self.settings["config"]
@property
def content_security_policy(self):
return self.settings["content_security_policy"]
@property
def default_format(self):
return self.settings["default_format"]
@property
def formats(self):
return self.settings["formats"]
@property
def frontpage_setup(self):
return self.settings["frontpage_setup"]
@property
def hub_api_token(self):
return self.settings.get("hub_api_token")
@property
def hub_api_url(self):
return self.settings.get("hub_api_url")
@property
def hub_base_url(self):
return self.settings["hub_base_url"]
@property
def hub_cookie_name(self):
return "jupyterhub-services"
@property
def index(self):
return self.settings["index"]
@property
def ipywidgets_base_url(self):
return self.settings["ipywidgets_base_url"]
@property
def jupyter_js_widgets_version(self):
return self.settings["jupyter_js_widgets_version"]
@property
def jupyter_widgets_html_manager_version(self):
return self.settings["jupyter_widgets_html_manager_version"]
@property
def mathjax_url(self):
return self.settings["mathjax_url"]
@property
def log(self):
return self.settings["log"]
@property
def max_cache_uris(self):
return self.settings.setdefault("max_cache_uris", set())
@property
def pending(self):
return self.settings.setdefault("pending", {})
@property
def pool(self):
return self.settings["pool"]
@property
def providers(self):
return self.settings["providers"]
@property
def rate_limiter(self):
return self.settings["rate_limiter"]
@property
def static_url_prefix(self):
return self.settings["static_url_prefix"]
@property
def statsd(self):
if hasattr(self, "_statsd"):
return self._statsd
if self.settings["statsd_host"]:
self._statsd = statsd.StatsClient(
self.settings["statsd_host"],
self.settings["statsd_port"],
self.settings["statsd_prefix"] + "." + type(self).__name__,
)
return self._statsd
else:
# return an empty mock object!
self._statsd = EmptyClass()
return self._statsd
# ---------------------------------------------------------------
# template rendering
# ---------------------------------------------------------------
def from_base(self, url, *args):
if not url.startswith("/") or url.startswith(self.base_url):
return url_path_join(url, *args)
return url_path_join(self.base_url, url, *args)
def get_template(self, name):
"""Return the jinja template object for a given name"""
return self.settings["jinja2_env"].get_template(name)
def render_template(self, name, **namespace):
namespace.update(self.template_namespace)
template = self.get_template(name)
return template.render(**namespace)
# Wrappers to facilitate custom rendering in subclasses without having to rewrite entire GET methods
# This would seem to mostly involve creating different template namespaces to enable custom logic in
# extended templates, but there might be other possibilities
def render_status_code_template(self, status_code, **namespace):
return self.render_template("%d.html" % status_code, **namespace)
def render_error_template(self, **namespace):
return self.render_template("error.html", **namespace)
@property
def template_namespace(self):
return {
"mathjax_url": self.mathjax_url,
"static_url": self.static_url,
"from_base": self.from_base,
"google_analytics_id": self.settings.get("google_analytics_id"),
"ipywidgets_base_url": self.ipywidgets_base_url,
"jupyter_js_widgets_version": self.jupyter_js_widgets_version,
"jupyter_widgets_html_manager_version": self.jupyter_widgets_html_manager_version,
}
# Overwrite the static_url method from Tornado to work better with our custom StaticFileHandler
def static_url(self, url):
return url_path_join(self.static_url_prefix, url)
def breadcrumbs(self, path, base_url):
"""Generate a list of breadcrumbs"""
breadcrumbs = []
if not path:
return breadcrumbs
for name in path.split("/"):
base_url = url_path_join(base_url, name)
breadcrumbs.append({"url": base_url, "name": name})
return breadcrumbs
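As a standalone sketch of what `breadcrumbs()` produces (a module-level copy with hypothetical inputs; the method itself joins segments with `url_path_join`):

```python
def make_breadcrumbs(path, base_url):
    # Each path segment extends the URL of the previous crumb.
    crumbs = []
    if not path:
        return crumbs
    for name in path.split("/"):
        base_url = base_url.rstrip("/") + "/" + name
        crumbs.append({"url": base_url, "name": name})
    return crumbs


print(make_breadcrumbs("user/repo/nb.ipynb", "/github"))
# → [{'url': '/github/user', 'name': 'user'},
#    {'url': '/github/user/repo', 'name': 'repo'},
#    {'url': '/github/user/repo/nb.ipynb', 'name': 'nb.ipynb'}]
```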
def get_page_links(self, response):
"""return prev_url, next_url for pagination
Response must be an HTTPResponse from a paginated GitHub API request.
Each will be None if there is no such link.
"""
links = parse_header_links(response.headers.get("Link", ""))
next_url = prev_url = None
if "next" in links:
next_url = "?" + urlparse(links["next"]["url"]).query
if "prev" in links:
prev_url = "?" + urlparse(links["prev"]["url"]).query
return prev_url, next_url
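The real Link-header parsing lives in `parse_header_links` in nbviewer.utils; a rough stdlib approximation of the pagination extraction above (header value is a made-up GitHub-style example):

```python
from urllib.parse import urlparse


def parse_links(header):
    # Minimal Link-header parser: '<url>; rel="next", <url>; rel="prev"'
    links = {}
    for part in header.split(","):
        url_part, _, rel_part = part.partition(";")
        url = url_part.strip().strip("<>")
        rel = rel_part.split("=")[1].strip().strip('"')
        links[rel] = {"url": url}
    return links


header = ('<https://api.github.com/repos?page=3>; rel="next", '
          '<https://api.github.com/repos?page=1>; rel="prev"')
links = parse_links(header)
next_url = "?" + urlparse(links["next"]["url"]).query
print(next_url)  # → ?page=3
```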
# ---------------------------------------------------------------
# error handling
# ---------------------------------------------------------------
def client_error_message(self, exc, url, body, msg=None):
"""Turn the tornado HTTP error into something useful
Returns error code
"""
str_exc = str(exc)
# strip the unhelpful 599 prefix
if str_exc.startswith("HTTP 599: "):
str_exc = str_exc[10:]
if (msg is None) and body and len(body) < 100:
# if it's a short plain-text error message, include it
msg = "%s (%s)" % (str_exc, escape(body))
if not msg:
msg = str_exc
# Now get the error code
if exc.code == 599:
if isinstance(exc, CurlError):
en = getattr(exc, "errno", -1)
# can't connect to server should be 404
# possibly more here
if en in (pycurl.E_COULDNT_CONNECT, pycurl.E_COULDNT_RESOLVE_HOST):
    code = 404
else:
    # otherwise, raise 400 with an informative message
    code = 400
elif exc.code >= 500:
# 5XX, server error, but not this server
code = 502
else:
# client-side error, blame our client
if exc.code == 404:
code = 404
msg = "Remote %s" % msg
else:
code = 400
return code, msg
def reraise_client_error(self, exc):
"""Remote fetch raised an error"""
try:
url = exc.response.request.url.split("?")[0]
body = exc.response.body.decode("utf8", "replace").strip()
except AttributeError:
url = "url"
body = ""
code, msg = self.client_error_message(exc, url, body)
slim_body = escape(body[:300])
self.log.warning("Fetching %s failed with %s. Body=%s", url, msg, slim_body)
raise web.HTTPError(code, msg)
@contextmanager
def catch_client_error(self):
"""context manager for catching httpclient errors
they are transformed into appropriate web.HTTPErrors
"""
try:
yield
except httpclient.HTTPError as e:
self.reraise_client_error(e)
except socket.error as e:
raise web.HTTPError(404, str(e))
@property
def fetch_kwargs(self):
return self.settings.setdefault("fetch_kwargs", {})
async def fetch(self, url, **overrides):
"""fetch a url with our async client
handle default arguments and wrapping exceptions
"""
kw = {}
kw.update(self.fetch_kwargs)
kw.update(overrides)
with self.catch_client_error():
response = await self.client.fetch(url, **kw)
return response
def write_error(self, status_code, **kwargs):
"""render custom error pages"""
exc_info = kwargs.get("exc_info")
message = ""
# ensure `exception` is defined even when no exc_info was passed
exception = None
status_message = responses.get(status_code, "Unknown")
if exc_info:
# get the custom message, if defined
exception = exc_info[1]
try:
message = exception.log_message % exception.args
except Exception:
pass
# construct the custom reason, if defined
reason = getattr(exception, "reason", "")
if reason:
status_message = reason
# build template namespace
namespace = dict(
status_code=status_code,
status_message=status_message,
message=message,
exception=exception,
)
# render the template
try:
html = self.render_status_code_template(status_code, **namespace)
except Exception as e:
html = self.render_error_template(**namespace)
self.set_header("Content-Type", "text/html")
self.write(html)
# ---------------------------------------------------------------
# response caching
# ---------------------------------------------------------------
@property
def cache_headers(self):
# are there other headers to cache?
h = {}
for key in ("Content-Type",):
if key in self._headers:
h[key] = self._headers[key]
return h
_cache_key = None
_cache_key_attr = "uri"
@property
def cache_key(self):
"""Use a checksum as the cache key because the cache backend limits key size
"""
if self._cache_key is None:
to_hash = utf8(getattr(self.request, self._cache_key_attr))
self._cache_key = hashlib.sha1(to_hash).hexdigest()
return self._cache_key
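Hashing keeps cache keys fixed-length regardless of URI length. A sketch (the URI is hypothetical):

```python
import hashlib

uri = "/github/user/repo/blob/main/very/long/path/notebook.ipynb"
key = hashlib.sha1(uri.encode("utf-8")).hexdigest()
print(len(key))  # → 40: constant-size key, however long the URI is
```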
def truncate(self, s, limit=256):
"""Truncate long strings"""
if len(s) > limit:
# keep the head and tail, eliding the middle
s = "%s...%s" % (s[: limit // 2], s[-limit // 2 :])
return s
async def cache_and_finish(self, content=""):
"""finish a request and cache the result
currently only works if:
- result is not written in multiple chunks
- custom headers are not used
"""
request_time = self.request.request_time()
# set cache expiry to 120x request time
# bounded by cache_expiry_min,max
# e.g. a 30 second render is cached for an hour, if cache_expiry_max allows
expiry = max(
min(120 * request_time, self.cache_expiry_max), self.cache_expiry_min
)
if self.request.uri in self.max_cache_uris:
# if it's a link from the front page, cache for a long time
expiry = self.cache_expiry_max
if expiry > 0:
self.set_header("Cache-Control", "max-age=%i" % expiry)
self.write(content)
self.finish()
short_url = self.truncate(self.request.path)
cache_data = pickle.dumps(
{"headers": self.cache_headers, "body": content}, pickle.HIGHEST_PROTOCOL
)
log = self.log.info if expiry > self.cache_expiry_min else self.log.debug
log("Caching (expiry=%is) %s", expiry, short_url)
try:
with time_block("Cache set %s" % short_url, logger=self.log):
await self.cache.set(
self.cache_key, cache_data, int(time.time() + expiry)
)
except Exception:
self.log.error("Cache set for %s failed", short_url, exc_info=True)
else:
self.log.debug("Cache set finished %s", short_url)
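The expiry clamp can be sketched in isolation (the 60 s floor and 2 h ceiling are hypothetical settings standing in for `cache_expiry_min`/`cache_expiry_max`):

```python
def bounded_expiry(request_time, expiry_min=60, expiry_max=2 * 60 * 60):
    # cache for 120x the render time, clamped to [expiry_min, expiry_max]
    return max(min(120 * request_time, expiry_max), expiry_min)


print(bounded_expiry(0.05))  # → 60   (fast responses get the floor)
print(bounded_expiry(30))    # → 3600 (a 30 s render caches for an hour)
print(bounded_expiry(120))   # → 7200 (clamped to the ceiling)
```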
def cached(method):
"""decorator for a cached page.
This only handles getting from the cache, not writing to it.
Writing to the cache must be handled in the decorated method.
"""
async def cached_method(self, *args, **kwargs):
uri = self.request.path
short_url = self.truncate(uri)
if self.get_argument("flush_cache", False):
await self.rate_limiter.check(self)
self.log.info("Flushing cache %s", short_url)
# call the wrapped method
await method(self, *args, **kwargs)
return
pending_future = self.pending.get(uri, None)
loop = IOLoop.current()
if pending_future:
self.log.info("Waiting for concurrent request at %s", short_url)
tic = loop.time()
await pending_future
toc = loop.time()
self.log.info(
"Waited %.3fs for concurrent request at %s", toc - tic, short_url
)
try:
with time_block("Cache get %s" % short_url, logger=self.log):
cached_pickle = await self.cache.get(self.cache_key)
if cached_pickle is not None:
cached = pickle.loads(cached_pickle)
else:
cached = None
except Exception as e:
self.log.error("Exception getting %s from cache", short_url, exc_info=True)
cached = None
if cached is not None:
self.log.info("Cache hit %s", short_url)
for key, value in cached["headers"].items():
self.set_header(key, value)
self.write(cached["body"])
else:
self.log.debug("Cache miss %s", short_url)
await self.rate_limiter.check(self)
future = self.pending[uri] = Future()
try:
# call the wrapped method
await method(self, *args, **kwargs)
finally:
self.pending.pop(uri, None)
# notify waiters
future.set_result(None)
return cached_method
class RenderingHandler(BaseHandler):
"""Base for handlers that render notebooks"""
# notebook caches based on path (no url params)
_cache_key_attr = "path"
@property
def render_timeout(self):
"""0 render_timeout means never finish early"""
return self.settings.setdefault("render_timeout", 0)
def initialize(self, **kwargs):
super().initialize(**kwargs)
loop = IOLoop.current()
if self.render_timeout:
self.slow_timeout = loop.add_timeout(
loop.time() + self.render_timeout, self.finish_early
)
def finish_early(self):
"""When the render is slow, draw a 'waiting' page instead,
relying on the cache to deliver the page to a future request.
"""
if self._finished:
return
self.log.info("Finishing early %s", self.request.uri)
html = self.render_template("slow_notebook.html")
self.set_status(202) # Accepted
self.finish(html)
# short circuit some methods because the rest of the rendering will still happen
self.write = self.finish = self.redirect = lambda chunk=None: None
self.statsd.incr("rendering.waiting", 1)
def filter_formats(self, nb, raw):
"""Generate (name, format) pairs for the formats that can render the given notebook JSON.
Formats that do not provide a `test` method are assumed to work for
any notebook.
"""
for name, format in self.formats.items():
test = format.get("test", None)
try:
if test is None or test(nb, raw):
yield (name, format)
except Exception as err:
self.log.info("Failed to test %s with format %s: %s", self.request.uri, name, err)
# empty methods to be implemented by subclasses to make GET requests more modular
def get_notebook_data(self, **kwargs):
"""
Pass as kwargs variables needed to define those variables which will be necessary for
the provider to find the notebook. (E.g. path for LocalHandler, user and repo for GitHub.)
Return variables the provider needs to find and load the notebook. Then run custom logic
in GET or pass the output of get_notebook_data immediately to deliver_notebook.
First part of any provider's GET method.
Custom logic, if applicable, is middle part of any provider's GET method, and usually
is implemented or overwritten in subclasses, while get_notebook_data and deliver_notebook
will often remain unchanged from the parent class (e.g. for a custom GitHub provider).
"""
pass
def deliver_notebook(self, **kwargs):
"""
Pass as kwargs the return values of get_notebook_data to this method. Get the JSON data
from the provider to render the notebook. Finish with a call to self.finish_notebook.
Last part of any provider's GET method.
"""
pass
# Wrappers to facilitate custom rendering in subclasses without having to rewrite entire GET methods
# This would seem to mostly involve creating different template namespaces to enable custom logic in
# extended templates, but there might be other possibilities
def render_notebook_template(
self, body, nb, download_url, json_notebook, **namespace
):
return self.render_template(
"formats/%s.html" % self.format,
body=body,
nb=nb,
download_url=download_url,
format=self.format,
default_format=self.default_format,
format_prefix=self.format_prefix,
formats=dict(self.filter_formats(nb, json_notebook)),
format_base=self.request.uri.replace(self.format_prefix, "").replace(
self.base_url, "/"
),
date=datetime.utcnow().strftime(self.date_fmt),
**namespace
)
async def finish_notebook(
self, json_notebook, download_url, msg=None, public=False, **namespace
):
"""Renders a notebook from its JSON body.
Parameters
----------
json_notebook: str
Notebook document in JSON format
download_url: str
URL to download the notebook document
msg: str, optional
Extra information to log when rendering fails
public: bool, optional
True if the notebook is public and should be indexed, False otherwise
"""
if msg is None:
msg = download_url
try:
parse_time = self.statsd.timer("rendering.parsing.time").start()
nb = reads(json_notebook, current_nbformat)
parse_time.stop()
except ValueError:
self.log.error("Failed to render %s", msg, exc_info=True)
self.statsd.incr("rendering.parsing.fail")
raise web.HTTPError(400, "Error reading JSON notebook")
try:
self.log.debug("Requesting render of %s", download_url)
with time_block(
"Rendered %s" % download_url, logger=self.log, debug_limit=0
):
self.log.info(
"Rendering %d B notebook from %s", len(json_notebook), download_url
)
render_time = self.statsd.timer("rendering.nbrender.time").start()
loop = asyncio.get_event_loop()
nbhtml, config = await loop.run_in_executor(
self.pool,
render_notebook,
self.formats[self.format],
nb,
download_url,
self.config,
)
render_time.stop()
except NbFormatError as e:
self.statsd.incr("rendering.nbrender.fail", 1)
self.log.error("Invalid notebook %s: %s", msg, e)
raise web.HTTPError(400, str(e))
except Exception as e:
self.statsd.incr("rendering.nbrender.fail", 1)
self.log.error("Failed to render %s", msg, exc_info=True)
raise web.HTTPError(400, str(e))
else:
self.statsd.incr("rendering.nbrender.success", 1)
self.log.debug("Finished render of %s", download_url)
html_time = self.statsd.timer("rendering.html.time").start()
html = self.render_notebook_template(
body=nbhtml,
nb=nb,
download_url=download_url,
json_notebook=json_notebook,
**namespace
)
html_time.stop()
if "content_type" in self.formats[self.format]:
self.set_header("Content-Type", self.formats[self.format]["content_type"])
await self.cache_and_finish(html)
# Index notebook
self.index.index_notebook(download_url, nb, public)
class FilesRedirectHandler(BaseHandler):
"""redirect files URLs without files prefix
matches behavior of old app, currently unused.
"""
def get(self, before_files, after_files):
self.log.info("Redirecting %s to %s", before_files, after_files)
self.redirect("%s/%s" % (before_files, after_files))
class AddSlashHandler(BaseHandler):
"""redirector for URLs that should always have trailing slash"""
def get(self, *args, **kwargs):
uri = self.request.path + "/"
if self.request.query:
uri = "%s?%s" % (uri, self.request.query)
self.redirect(uri)
class RemoveSlashHandler(BaseHandler):
"""redirector for URLs that should never have trailing slash"""
def get(self, *args, **kwargs):
uri = self.request.path.rstrip("/")
if self.request.query:
uri = "%s?%s" % (uri, self.request.query)
self.redirect(uri)

@@ -0,0 +1 @@
from .handlers import uri_rewrites

@@ -0,0 +1,15 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
def uri_rewrites(rewrites=[]):
return rewrites + [
(
r"^http(s?)://www\.dropbox\.com/(sh?)/(.+?)(\?dl=.)*$",
u"/url{0}/dl.dropbox.com/{1}/{2}",
)
]
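As a sketch of how nbviewer applies these rewrite tuples (the real dispatch lives in the URL provider; `apply_rewrite` below is a hypothetical helper), the regex groups are substituted into the format template:

```python
import re

def apply_rewrite(url, pattern, template):
    # Hypothetical helper: match the URL against a rewrite pattern and
    # substitute the captured groups into the format template.
    match = re.match(pattern, url)
    if match is None:
        return None
    return template.format(*match.groups())

rewritten = apply_rewrite(
    "https://www.dropbox.com/s/abc123/notebook.ipynb",
    r"^http(s?)://www\.dropbox\.com/(sh?)/(.+?)(\?dl=.)*$",
    "/url{0}/dl.dropbox.com/{1}/{2}",
)
print(rewritten)  # /urls/dl.dropbox.com/s/abc123/notebook.ipynb
```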

@@ -0,0 +1,2 @@
from .handlers import default_handlers
from .handlers import uri_rewrites

@@ -0,0 +1,363 @@
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
import json
import os
from tornado import web
from .. import _load_handler_from_location
from ...utils import clean_filename
from ...utils import quote
from ...utils import response_text
from ...utils import url_path_join
from ..base import BaseHandler
from ..base import cached
from ..base import RenderingHandler
from ..github.handlers import GithubClientMixin
class GistClientMixin(GithubClientMixin):
# PROVIDER_CTX is a dictionary whose entries are passed as keyword arguments
# to the render_template method of the GistHandler. The following describes
# the information contained in each of these keyword arguments:
# provider_label: str
# Text to apply to the navbar icon linking to the provider
# provider_icon: str
# CSS classname to apply to the navbar icon linking to the provider
# executor_label: str, optional
# Text to apply to the navbar icon linking to the execution service
# executor_icon: str, optional
# CSS classname to apply to the navbar icon linking to the execution service
PROVIDER_CTX = {
"provider_label": "Gist",
"provider_icon": "github-square",
"executor_label": "Binder",
"executor_icon": "icon-binder",
}
BINDER_TMPL = "{binder_base_url}/gist/{user}/{gist_id}/master"
BINDER_PATH_TMPL = BINDER_TMPL + "?filepath={path}"
def client_error_message(self, exc, url, body, msg=None):
if exc.code == 403 and "too big" in body.lower():
return 400, "GitHub will not serve raw gists larger than 10MB"
return super().client_error_message(exc, url, body, msg)
class UserGistsHandler(GistClientMixin, BaseHandler):
"""list a user's gists containing notebooks
.ipynb file extension is required for listing (not for rendering).
"""
def render_usergists_template(
self, entries, user, provider_url, prev_url, next_url, **namespace
):
"""
provider_url: str
URL to the notebook document upstream at the provider (e.g., GitHub)
executor_url: str, optional (kwarg passed into `namespace`)
URL to execute the notebook document (e.g., Binder)
"""
return self.render_template(
"usergists.html",
entries=entries,
user=user,
provider_url=provider_url,
prev_url=prev_url,
next_url=next_url,
**self.PROVIDER_CTX,
**namespace
)
@cached
async def get(self, user, **namespace):
page = self.get_argument("page", None)
params = {}
if page:
params["page"] = page
with self.catch_client_error():
response = await self.github_client.get_gists(user, params=params)
prev_url, next_url = self.get_page_links(response)
gists = json.loads(response_text(response))
entries = []
for gist in gists:
notebooks = [f for f in gist["files"] if f.endswith(".ipynb")]
if notebooks:
entries.append(
dict(
id=gist["id"],
notebooks=notebooks,
description=gist["description"] or "",
)
)
if self.github_url == "https://github.com/":
gist_base_url = "https://gist.github.com/"
else:
gist_base_url = url_path_join(self.github_url, "gist/")
provider_url = url_path_join(gist_base_url, u"{user}".format(user=user))
html = self.render_usergists_template(
entries=entries,
user=user,
provider_url=provider_url,
prev_url=prev_url,
next_url=next_url,
**namespace
)
await self.cache_and_finish(html)
class GistHandler(GistClientMixin, RenderingHandler):
"""render a gist notebook, or list files if a multifile gist"""
async def parse_gist(self, user, gist_id, filename=""):
with self.catch_client_error():
response = await self.github_client.get_gist(gist_id)
gist = json.loads(response_text(response))
gist_id = gist["id"]
if user is None:
# redirect to /gist/user/gist_id if no user given
owner_dict = gist.get("owner", {})
if owner_dict:
user = owner_dict["login"]
else:
user = "anonymous"
new_url = u"{format}/gist/{user}/{gist_id}".format(
format=self.format_prefix, user=user, gist_id=gist_id
)
if filename:
new_url = new_url + "/" + filename
self.redirect(self.from_base(new_url))
return
files = gist["files"]
many_files_gist = len(files) > 1
# user and gist_id get modified
return user, gist_id, gist, files, many_files_gist
# Analogous to GitHubTreeHandler
async def tree_get(self, user, gist_id, gist, files):
"""
user, gist_id, gist, and files are most of the values returned by parse_gist
"""
entries = []
ipynbs = []
others = []
for file in files.values():
e = {}
e["name"] = file["filename"]
if file["filename"].endswith(".ipynb"):
e["url"] = quote("/%s/%s" % (gist_id, file["filename"]))
e["class"] = "fa-book"
ipynbs.append(e)
else:
if self.github_url == "https://github.com/":
gist_base_url = "https://gist.github.com/"
else:
gist_base_url = url_path_join(self.github_url, "gist/")
provider_url = url_path_join(
gist_base_url,
u"{user}/{gist_id}#file-{clean_name}".format(
user=user,
gist_id=gist_id,
clean_name=clean_filename(file["filename"]),
),
)
e["url"] = provider_url
e["class"] = "fa-share"
others.append(e)
entries.extend(ipynbs)
entries.extend(others)
# Enable a binder navbar icon if a binder base URL is configured
executor_url = (
self.BINDER_TMPL.format(
binder_base_url=self.binder_base_url,
user=user.rstrip("/"),
gist_id=gist_id,
)
if self.binder_base_url
else None
)
# provider_url:
# URL to the notebook document upstream at the provider (e.g., GitHub)
# executor_url: str, optional
# URL to execute the notebook document (e.g., Binder)
html = self.render_template(
"treelist.html",
entries=entries,
tree_type="gist",
tree_label="gists",
user=user.rstrip("/"),
provider_url=gist["html_url"],
executor_url=executor_url,
**self.PROVIDER_CTX
)
await self.cache_and_finish(html)
# Analogous to GitHubBlobHandler
async def file_get(self, user, gist_id, filename, gist, many_files_gist, file):
content = await self.get_notebook_data(gist_id, filename, many_files_gist, file)
if not content:
return
await self.deliver_notebook(user, gist_id, filename, gist, file, content)
# Only called by file_get
async def get_notebook_data(self, gist_id, filename, many_files_gist, file):
"""
gist_id, filename, many_files_gist, file are all passed to file_get
"""
if (file["type"] or "").startswith("image/"):
self.log.debug(
"Fetching raw image (%s) %s/%s: %s",
file["type"],
gist_id,
filename,
file["raw_url"],
)
response = await self.fetch(file["raw_url"])
# use raw bytes for images:
content = response.body
elif file["truncated"]:
self.log.debug(
"Gist %s/%s truncated, fetching %s", gist_id, filename, file["raw_url"]
)
response = await self.fetch(file["raw_url"])
content = response_text(response, encoding="utf-8")
else:
content = file["content"]
if many_files_gist and not filename.endswith(".ipynb"):
self.set_header("Content-Type", file.get("type") or "text/plain")
# cannot redirect because of X-Frame-Content
self.finish(content)
return
else:
return content
# Only called by file_get
async def deliver_notebook(self, user, gist_id, filename, gist, file, content):
"""
user, gist_id, filename, gist, and file are the same values as those
passed into file_get, whereas content is returned from
get_notebook_data using user, gist_id, filename, gist, and file.
"""
# Enable a binder navbar icon if a binder base URL is configured
executor_url = (
self.BINDER_PATH_TMPL.format(
binder_base_url=self.binder_base_url,
user=user.rstrip("/"),
gist_id=gist_id,
path=quote(filename),
)
if self.binder_base_url
else None
)
# provider_url: str, optional
# URL to the notebook document upstream at the provider (e.g., GitHub)
await self.finish_notebook(
content,
file["raw_url"],
msg="gist: %s" % gist_id,
public=gist["public"],
provider_url=gist["html_url"],
executor_url=executor_url,
**self.PROVIDER_CTX
)
@cached
async def get(self, user, gist_id, filename=""):
"""
Encompasses both the case of a single file gist, handled by
`file_get`, as well as a many-file gist, handled by `tree_get`.
"""
parsed_gist = await self.parse_gist(user, gist_id, filename)
if parsed_gist is not None:
user, gist_id, gist, files, many_files_gist = parsed_gist
else:
return
if many_files_gist and not filename:
await self.tree_get(user, gist_id, gist, files)
else:
if not many_files_gist and not filename:
filename = list(files.keys())[0]
if filename not in files:
raise web.HTTPError(
404, "No such file in gist: %s (%s)", filename, list(files.keys())
)
file = files[filename]
await self.file_get(user, gist_id, filename, gist, many_files_gist, file)
class GistRedirectHandler(BaseHandler):
"""redirect old /<gist-id> to new /gist/<gist-id>"""
def get(self, gist_id, file=""):
new_url = "%s/gist/%s" % (self.format_prefix, gist_id)
if file:
new_url = "%s/%s" % (new_url, file)
self.log.info("Redirecting %s to %s", self.request.uri, new_url)
self.redirect(self.from_base(new_url))
def default_handlers(handlers=[], **handler_names):
"""Tornado handlers"""
gist_handler = _load_handler_from_location(handler_names["gist_handler"])
user_gists_handler = _load_handler_from_location(
handler_names["user_gists_handler"]
)
return handlers + [
(r"/gist/([^\/]+/)?([0-9]+|[0-9a-f]{20,})", gist_handler, {}),
(r"/gist/([^\/]+/)?([0-9]+|[0-9a-f]{20,})/(?:files/)?(.*)", gist_handler, {}),
(r"/([0-9]+|[0-9a-f]{20,})", GistRedirectHandler, {}),
(r"/([0-9]+|[0-9a-f]{20,})/(.*)", GistRedirectHandler, {}),
(r"/gist/([^\/]+)/?", user_gists_handler, {}),
]
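The ID alternation in the routes above accepts either an all-numeric ID (old-style gists) or a hex ID of 20 or more characters. A quick standalone check of that pattern:

```python
import re

# Same alternation as in the routes above, anchored for a standalone check
GIST_ID = re.compile(r"^([0-9]+|[0-9a-f]{20,})$")

assert GIST_ID.match("2352771")                                   # old numeric ID
assert GIST_ID.match("cf023a8db7097bd9fe92aabbccddeeff00112233")  # modern hex ID
assert GIST_ID.match("deadbeef") is None  # hex, but shorter than 20 chars
```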
def uri_rewrites(rewrites=[]):
gist_rewrites = [
(r"^([a-f0-9]+)/?$", u"/{0}"),
(r"^https?://gist\.github\.com/([^\/]+/)?([a-f0-9]+)/?$", u"/{1}"),
]
# github enterprise
if os.environ.get("GITHUB_API_URL", "") != "":
gist_base_url = url_path_join(
os.environ.get("GITHUB_API_URL").split("/api/v3")[0], "gist/"
)
gist_rewrites.extend(
[
# Fetching the Gist ID which is embedded in the URL, but with a different base URL
(r"^" + gist_base_url + r"([^\/]+/)?([a-f0-9]+)/?$", u"/{1}")
]
)
return gist_rewrites + rewrites
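For GitHub Enterprise, the gist base URL is recovered by splitting `GITHUB_API_URL` at `/api/v3`. A rough standalone equivalent (the hostname is a made-up example, and string concatenation stands in for `url_path_join`):

```python
api_url = "https://github.example.com/api/v3/"  # hypothetical GITHUB_API_URL
gist_base_url = api_url.split("/api/v3")[0] + "/gist/"
print(gist_base_url)  # https://github.example.com/gist/
```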

@@ -0,0 +1,77 @@
# -*- coding: utf-8 -*-
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import requests
from ....tests.base import FormatHTMLMixin
from ....tests.base import NBViewerTestCase
from ....tests.base import skip_unless_github_auth
class GistTestCase(NBViewerTestCase):
@skip_unless_github_auth
def test_gist(self):
url = self.url("2352771")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
@skip_unless_github_auth
def test_gist_not_nb(self):
url = self.url("6689377")
r = requests.get(url)
self.assertEqual(r.status_code, 400)
@skip_unless_github_auth
def test_gist_no_such_file(self):
url = self.url("6689377/no/file.ipynb")
r = requests.get(url)
self.assertEqual(r.status_code, 404)
@skip_unless_github_auth
def test_gist_list(self):
url = self.url("7518294")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
html = r.text
self.assertIn("<th>Name</th>", html)
@skip_unless_github_auth
def test_multifile_gist(self):
url = self.url("7518294", "Untitled0.ipynb")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
html = r.text
self.assertIn("Download Notebook", html)
@skip_unless_github_auth
def test_anonymous_gist(self):
url = self.url("gist/4465051")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
html = r.text
self.assertIn("Download Notebook", html)
@skip_unless_github_auth
def test_gist_unicode(self):
url = self.url("gist/amueller/3974344")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
html = r.text
self.assertIn("<th>Name</th>", html)
@skip_unless_github_auth
def test_gist_unicode_content(self):
url = self.url("gist/ocefpaf/cf023a8db7097bd9fe92")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
html = r.text
self.assertNotIn("param&#195;&#169;trica", html)
self.assertIn("param&#233;trica", html)
class FormatHTMLGistTestCase(GistTestCase, FormatHTMLMixin):
pass

@@ -0,0 +1,2 @@
from .handlers import default_handlers
from .handlers import uri_rewrites

@@ -0,0 +1,181 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import json
import os
from urllib.parse import urlparse
from tornado.httpclient import AsyncHTTPClient
from tornado.httpclient import HTTPError
from tornado.httputil import url_concat
from ...utils import quote
from ...utils import response_text
from ...utils import url_path_join
# -----------------------------------------------------------------------------
# Async GitHub Client
# -----------------------------------------------------------------------------
class AsyncGitHubClient(object):
"""AsyncHTTPClient wrapper with methods for common requests"""
auth = None
def __init__(self, log, client=None):
self.log = log
self.client = client or AsyncHTTPClient()
self.github_api_url = os.environ.get(
"GITHUB_API_URL", "https://api.github.com/"
)
self.authenticate()
def authenticate(self):
self.auth = {
"client_id": os.environ.get("GITHUB_OAUTH_KEY", ""),
"client_secret": os.environ.get("GITHUB_OAUTH_SECRET", ""),
"access_token": os.environ.get("GITHUB_API_TOKEN", ""),
}
def fetch(self, url, params=None, **kwargs):
"""Add GitHub auth to self.client.fetch"""
if not url.startswith(self.github_api_url):
raise ValueError("Only fetch GitHub urls with GitHub auth (%s)" % url)
params = {} if params is None else params
kwargs.setdefault("user_agent", "Tornado-Async-GitHub-Client")
if self.auth["client_id"] and self.auth["client_secret"]:
kwargs["auth_username"] = self.auth["client_id"]
kwargs["auth_password"] = self.auth["client_secret"]
if self.auth["access_token"]:
headers = kwargs.setdefault("headers", {})
headers["Authorization"] = "token " + self.auth["access_token"]
url = url_concat(url, params)
future = self.client.fetch(url, **kwargs)
future.add_done_callback(self._log_rate_limit)
return future
def _log_rate_limit(self, future):
"""log GitHub rate limit headers
- error if 0 remaining
- warn if 10% or less remain
- debug otherwise
"""
try:
r = future.result()
except HTTPError as e:
r = e.response
if r is None:
# some errors don't have a response (e.g. failure to build request)
return
limit_s = r.headers.get("X-RateLimit-Limit", "")
remaining_s = r.headers.get("X-RateLimit-Remaining", "")
if not remaining_s or not limit_s:
if r.code < 300:
self.log.warning(
"No rate limit headers. Did GitHub change? %s",
json.dumps(dict(r.headers), indent=1),
)
return
remaining = int(remaining_s)
limit = int(limit_s)
if remaining == 0 and r.code >= 400:
text = response_text(r)
try:
message = json.loads(text)["message"]
except Exception:
# Can't extract message, log full reply
message = text
self.log.error("GitHub rate limit (%s) exceeded: %s", limit, message)
return
if 10 * remaining > limit:
log = self.log.info
else:
log = self.log.warning
log("%i/%i GitHub API requests remaining", remaining, limit)
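The threshold check reads as: log at info while more than 10% of the quota remains, otherwise warn (zero remaining on an error response is handled separately above). Restated in isolation:

```python
def rate_limit_level(remaining, limit):
    # Mirrors the branch above: info while more than 10% of the quota
    # remains, warn at 10% or below.
    return "info" if 10 * remaining > limit else "warn"

print(rate_limit_level(600, 5000))  # info (12% remaining)
print(rate_limit_level(500, 5000))  # warn (exactly 10% does not pass)
```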
def github_api_request(self, path, **kwargs):
"""Make a GitHub API request to URL
URL is constructed from url and params, if specified.
**kwargs are passed to client.fetch unmodified.
"""
url = url_path_join(self.github_api_url, quote(path))
return self.fetch(url, **kwargs)
def get_gist(self, gist_id, **kwargs):
"""Get a gist"""
path = u"gists/{}".format(gist_id)
return self.github_api_request(path, **kwargs)
def get_contents(self, user, repo, path, ref=None, **kwargs):
"""Make contents API request - either file contents or directory listing"""
path = u"repos/{user}/{repo}/contents/{path}".format(**locals())
if ref is not None:
params = kwargs.setdefault("params", {})
params["ref"] = ref
return self.github_api_request(path, **kwargs)
def get_repos(self, user, **kwargs):
"""List a user's repos"""
path = u"users/{user}/repos".format(user=user)
return self.github_api_request(path, **kwargs)
def get_gists(self, user, **kwargs):
"""List a user's gists"""
path = u"users/{user}/gists".format(user=user)
return self.github_api_request(path, **kwargs)
def get_repo(self, user, repo, **kwargs):
"""Get a repo"""
path = u"repos/{user}/{repo}".format(user=user, repo=repo)
return self.github_api_request(path, **kwargs)
def get_tree(self, user, repo, path, ref="master", recursive=False, **kwargs):
"""Get a git tree"""
# only need a recursive fetch if it's not in the top-level dir
if "/" in path:
recursive = True
path = u"repos/{user}/{repo}/git/trees/{ref}".format(**locals())
if recursive:
params = kwargs.setdefault("params", {})
params["recursive"] = True
tree = self.github_api_request(path, **kwargs)
return tree
def get_branches(self, user, repo, **kwargs):
"""List a repo's branches"""
path = u"repos/{user}/{repo}/branches".format(user=user, repo=repo)
return self.github_api_request(path, **kwargs)
def get_tags(self, user, repo, **kwargs):
"""List a repo's tags"""
path = u"repos/{user}/{repo}/tags".format(user=user, repo=repo)
return self.github_api_request(path, **kwargs)
def extract_tree_entry(self, path, tree_response):
"""extract a single tree entry from
a tree response for a given path
raises 404 if not found
Useful for finding the blob url for a given path.
"""
tree_response.rethrow()
self.log.debug(tree_response)
jsondata = response_text(tree_response)
data = json.loads(jsondata)
for entry in data["tree"]:
if entry["path"] == path:
return entry
raise HTTPError(404, "%s not found among %i files" % (path, len(data["tree"])))
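The scan in `extract_tree_entry` walks the `tree` array of a Git Trees API payload. A sketch against a hand-built payload (field names follow the GitHub API; the HTTP response wrapper and the 404 are replaced by a plain `None`):

```python
import json

tree_json = json.dumps({
    "tree": [
        {"path": "README.md", "type": "blob", "url": "blob-url-1"},
        {"path": "notebooks/demo.ipynb", "type": "blob", "url": "blob-url-2"},
    ]
})

def find_entry(path, jsondata):
    # Linear scan over the tree entries, as in extract_tree_entry
    data = json.loads(jsondata)
    for entry in data["tree"]:
        if entry["path"] == path:
            return entry
    return None

print(find_entry("notebooks/demo.ipynb", tree_json)["url"])  # blob-url-2
```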

@@ -0,0 +1,554 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import json
import mimetypes
import os
import re
from tornado import web
from tornado.escape import url_unescape
from .. import _load_handler_from_location
from ...utils import base64_decode
from ...utils import quote
from ...utils import response_text
from ...utils import url_path_join
from ..base import AddSlashHandler
from ..base import BaseHandler
from ..base import cached
from ..base import RemoveSlashHandler
from ..base import RenderingHandler
from .client import AsyncGitHubClient
class GithubClientMixin(object):
# PROVIDER_CTX is a dictionary whose entries are passed as keyword arguments
# to the render_template method of handlers using this mixin. The following
# describes the information contained in each of these keyword arguments:
# provider_label: str
# Text to apply to the navbar icon linking to the provider
# provider_icon: str
# CSS classname to apply to the navbar icon linking to the provider
# executor_label: str, optional
# Text to apply to the navbar icon linking to the execution service
# executor_icon: str, optional
# CSS classname to apply to the navbar icon linking to the execution service
PROVIDER_CTX = {
"provider_label": "GitHub",
"provider_icon": "github",
"executor_label": "Binder",
"executor_icon": "icon-binder",
}
BINDER_TMPL = "{binder_base_url}/gh/{org}/{repo}/{ref}"
BINDER_PATH_TMPL = BINDER_TMPL + "?filepath={path}"
@property
def github_url(self):
if getattr(self, "_github_url", None) is None:
if os.environ.get("GITHUB_URL", ""):
self._github_url = os.environ.get("GITHUB_URL")
elif self.github_client.github_api_url == "https://api.github.com/":
self._github_url = "https://github.com/"
else:
# Github Enterprise
# https://developer.github.com/enterprise/2.18/v3/enterprise-admin/#endpoint-urls
self._github_url = re.sub(
r"api/v3/$", "", self.github_client.github_api_url
)
return self._github_url
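Per the Enterprise endpoint-URL convention referenced above, the web base URL is the API URL with the `api/v3/` suffix removed. The substitution in isolation (the hostname is a made-up example):

```python
import re

def web_url_from_api(api_url):
    # Strip a trailing "api/v3/" to recover the Enterprise web base URL
    return re.sub(r"api/v3/$", "", api_url)

print(web_url_from_api("https://github.example.com/api/v3/"))
# https://github.example.com/
```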
@property
def github_client(self):
"""Create an upgraded github API client from the HTTP client"""
if getattr(self, "_github_client", None) is None:
self._github_client = AsyncGitHubClient(self.log, self.client)
return self._github_client
def client_error_message(self, exc, url, body, msg=None):
if exc.code == 403 and "rate limit" in body.lower():
return 503, "GitHub API rate limit exceeded. Try again soon."
return super().client_error_message(exc, url, body, msg)
class RawGitHubURLHandler(BaseHandler):
"""redirect old /urls/raw.github urls to /github/ API urls"""
def get(self, user, repo, path):
new_url = u"{format}/github/{user}/{repo}/blob/{path}".format(
format=self.format_prefix, user=user, repo=repo, path=path
)
self.log.info("Redirecting %s to %s", self.request.uri, new_url)
self.redirect(self.from_base(new_url))
class GitHubRedirectHandler(GithubClientMixin, BaseHandler):
"""redirect github urls to /github/ API urls"""
def get(self, url):
new_url = u"{format}/github/{url}".format(format=self.format_prefix, url=url)
self.log.info("Redirecting %s to %s", self.request.uri, new_url)
self.redirect(self.from_base(new_url))
class GitHubUserHandler(GithubClientMixin, BaseHandler):
"""list a user's github repos"""
def render_github_user_template(
self, entries, provider_url, next_url, prev_url, **namespace
):
return self.render_template(
"userview.html",
entries=entries,
provider_url=provider_url,
next_url=next_url,
prev_url=prev_url,
**self.PROVIDER_CTX,
**namespace
)
@cached
async def get(self, user):
page = self.get_argument("page", None)
params = {"sort": "updated"}
if page:
params["page"] = page
with self.catch_client_error():
response = await self.github_client.get_repos(user, params=params)
prev_url, next_url = self.get_page_links(response)
repos = json.loads(response_text(response))
entries = []
for repo in repos:
entries.append(dict(url=repo["name"], name=repo["name"]))
provider_url = u"{github_url}{user}".format(
user=user, github_url=self.github_url
)
html = self.render_github_user_template(
entries=entries,
provider_url=provider_url,
next_url=next_url,
prev_url=prev_url,
)
await self.cache_and_finish(html)
class GitHubRepoHandler(GithubClientMixin, BaseHandler):
"""redirect /github/user/repo to .../tree/master"""
async def get(self, user, repo):
response = await self.github_client.get_repo(user, repo)
default_branch = json.loads(response_text(response))["default_branch"]
new_url = self.from_base(
"/", self.format_prefix, "github", user, repo, "tree", default_branch
)
self.log.info("Redirecting %s to %s", self.request.uri, new_url)
self.redirect(new_url)
class GitHubTreeHandler(GithubClientMixin, BaseHandler):
"""list files in a github repo (like github tree)"""
def render_treelist_template(
self,
entries,
breadcrumbs,
provider_url,
user,
repo,
ref,
path,
branches,
tags,
executor_url,
**namespace
):
"""
breadcrumbs: list of dict
Breadcrumb 'name' and 'url' to render as links at the top of the notebook page
provider_url: str
URL to the notebook document upstream at the provider (e.g., GitHub)
executor_url: str, optional
URL to execute the notebook document (e.g., Binder)
"""
return self.render_template(
"treelist.html",
entries=entries,
breadcrumbs=breadcrumbs,
provider_url=provider_url,
user=user,
repo=repo,
ref=ref,
path=path,
branches=branches,
tags=tags,
tree_type="github",
tree_label="repositories",
executor_url=executor_url,
**self.PROVIDER_CTX,
**namespace
)
@cached
async def get(self, user, repo, ref, path):
if not self.request.uri.endswith("/"):
self.redirect(self.request.uri + "/")
return
path = path.rstrip("/")
with self.catch_client_error():
response = await self.github_client.get_contents(user, repo, path, ref=ref)
contents = json.loads(response_text(response))
branches, tags = await self.refs(user, repo)
for nav_ref in branches + tags:
nav_ref["url"] = u"/github/{user}/{repo}/tree/{ref}/{path}".format(
ref=nav_ref["name"], user=user, repo=repo, path=path
)
if not isinstance(contents, list):
self.log.info(
"{format}/{user}/{repo}/{ref}/{path} not tree, redirecting to blob",
extra=dict(
format=self.format_prefix, user=user, repo=repo, ref=ref, path=path
),
)
self.redirect(
u"{format}/github/{user}/{repo}/blob/{ref}/{path}".format(
format=self.format_prefix, user=user, repo=repo, ref=ref, path=path
)
)
return
# Account for possibility that GitHub API redirects us to get more accurate breadcrumbs
# See: https://github.com/jupyter/nbviewer/issues/324
example_file_url = contents[0]["html_url"]
user, repo = re.match(
r"^" + self.github_url + "(?P<user>[^\/]+)/(?P<repo>[^\/]+)/.*",
example_file_url,
).group("user", "repo")
base_url = u"/github/{user}/{repo}/tree/{ref}".format(
user=user, repo=repo, ref=ref
)
provider_url = u"{github_url}{user}/{repo}/tree/{ref}/{path}".format(
user=user, repo=repo, ref=ref, path=path, github_url=self.github_url
)
breadcrumbs = [{"url": base_url, "name": repo}]
breadcrumbs.extend(self.breadcrumbs(path, base_url))
entries = []
dirs = []
ipynbs = []
others = []
for file in contents:
e = {}
e["name"] = file["name"]
if file["type"] == "dir":
e["url"] = u"/github/{user}/{repo}/tree/{ref}/{path}".format(
user=user, repo=repo, ref=ref, path=file["path"]
)
e["url"] = quote(e["url"])
e["class"] = "fa-folder-open"
dirs.append(e)
elif file["name"].endswith(".ipynb"):
e["url"] = u"/github/{user}/{repo}/blob/{ref}/{path}".format(
user=user, repo=repo, ref=ref, path=file["path"]
)
e["url"] = quote(e["url"])
e["class"] = "fa-book"
ipynbs.append(e)
elif file["html_url"]:
e["url"] = file["html_url"]
e["class"] = "fa-share"
others.append(e)
else:
# submodules don't have html_url
e["url"] = ""
e["class"] = "fa-folder-close"
others.append(e)
entries.extend(dirs)
entries.extend(ipynbs)
entries.extend(others)
# Enable a binder navbar icon if a binder base URL is configured
executor_url = (
self.BINDER_TMPL.format(
binder_base_url=self.binder_base_url, org=user, repo=repo, ref=ref
)
if self.binder_base_url
else None
)
html = self.render_treelist_template(
entries=entries,
breadcrumbs=breadcrumbs,
provider_url=provider_url,
user=user,
repo=repo,
ref=ref,
path=path,
branches=branches,
tags=tags,
executor_url=executor_url,
)
await self.cache_and_finish(html)
async def refs(self, user, repo):
"""get branches and tags for this user/repo"""
ref_types = ("branches", "tags")
ref_data = [None, None]
for i, ref_type in enumerate(ref_types):
with self.catch_client_error():
response = await getattr(self.github_client, "get_%s" % ref_type)(
user, repo
)
ref_data[i] = json.loads(response_text(response))
return ref_data
class GitHubBlobHandler(GithubClientMixin, RenderingHandler):
"""handler for files on github
If it's a...
- notebook, render it
- non-notebook file, serve file unmodified
- directory, redirect to tree
"""
async def get_notebook_data(self, user, repo, ref, path):
if os.environ.get("GITHUB_API_URL", "") == "":
raw_url = u"https://raw.githubusercontent.com/{user}/{repo}/{ref}/{path}".format(
user=user, repo=repo, ref=ref, path=quote(path)
)
else: # Github Enterprise has a different URL pattern for accessing raw files
raw_url = url_path_join(
self.github_url, user, repo, "raw", ref, quote(path)
)
blob_url = u"{github_url}{user}/{repo}/blob/{ref}/{path}".format(
user=user, repo=repo, ref=ref, path=quote(path), github_url=self.github_url
)
with self.catch_client_error():
tree = await self.github_client.get_tree(
user, repo, path=url_unescape(path), ref=ref
)
tree_entry = self.github_client.extract_tree_entry(
path=url_unescape(path), tree_response=tree
)
if tree_entry["type"] == "tree":
tree_url = "/github/{user}/{repo}/tree/{ref}/{path}/".format(
user=user, repo=repo, ref=ref, path=quote(path)
)
self.log.info(
"%s is a directory, redirecting to %s", self.request.path, tree_url
)
self.redirect(tree_url)
return
return raw_url, blob_url, tree_entry
async def deliver_notebook(
self, user, repo, ref, path, raw_url, blob_url, tree_entry
):
# fetch file data from the blobs API
with self.catch_client_error():
response = await self.github_client.fetch(tree_entry["url"])
data = json.loads(response_text(response))
contents = data["content"]
if data["encoding"] == "base64":
# filedata will be bytes
filedata = base64_decode(contents)
else:
# filedata will be unicode
filedata = contents
if path.endswith(".ipynb"):
dir_path = path.rsplit("/", 1)[0]
base_url = "/github/{user}/{repo}/tree/{ref}".format(
user=user, repo=repo, ref=ref
)
breadcrumbs = [{"url": base_url, "name": repo}]
breadcrumbs.extend(self.breadcrumbs(dir_path, base_url))
# Enable a binder navbar icon if a binder base URL is configured
executor_url = (
self.BINDER_PATH_TMPL.format(
binder_base_url=self.binder_base_url,
org=user,
repo=repo,
ref=ref,
path=quote(path),
)
if self.binder_base_url
else None
)
try:
# filedata may be bytes, but we need text
if isinstance(filedata, bytes):
nbjson = filedata.decode("utf-8")
else:
nbjson = filedata
except Exception as e:
self.log.error("Failed to decode notebook: %s", raw_url, exc_info=True)
raise web.HTTPError(400)
# Explanation of some kwargs passed into `finish_notebook`:
# provider_url:
# URL to the notebook document upstream at the provider (e.g., GitHub)
# breadcrumbs: list of dict
# Breadcrumb 'name' and 'url' to render as links at the top of the notebook page
# executor_url: str, optional
# URL to execute the notebook document (e.g., Binder)
await self.finish_notebook(
nbjson,
raw_url,
provider_url=blob_url,
executor_url=executor_url,
breadcrumbs=breadcrumbs,
msg="file from GitHub: %s" % raw_url,
public=True,
**self.PROVIDER_CTX
)
else:
mime, enc = mimetypes.guess_type(path)
self.set_header("Content-Type", mime or "text/plain")
await self.cache_and_finish(filedata)
@cached
async def get(self, user, repo, ref, path):
notebook_data = await self.get_notebook_data(user, repo, ref, path)
if notebook_data is not None:
raw_url, blob_url, tree_entry = notebook_data
else:
return
await self.deliver_notebook(
user, repo, ref, path, raw_url, blob_url, tree_entry
)
def default_handlers(handlers=[], **handler_names):
"""Tornado handlers"""
blob_handler = _load_handler_from_location(handler_names["github_blob_handler"])
tree_handler = _load_handler_from_location(handler_names["github_tree_handler"])
user_handler = _load_handler_from_location(handler_names["github_user_handler"])
return (
[
# ideally these URIs should have been caught by an appropriate
# uri_rewrite rather than letting the url provider catch them and then
# fixing it here.
# There are probably links in the wild that depend on these, so keep
# these handlers for backwards compatibility.
(r"/url[s]?/github\.com/(?P<url>.*)", GitHubRedirectHandler, {}),
(
r"/url[s]?/raw\.?github\.com/(?P<user>[^\/]+)/(?P<repo>[^\/]+)/(?P<path>.*)",
RawGitHubURLHandler,
{},
),
(
r"/url[s]?/raw\.?githubusercontent\.com/(?P<user>[^\/]+)/(?P<repo>[^\/]+)/(?P<path>.*)",
RawGitHubURLHandler,
{},
),
]
+ handlers
+ [
(r"/github/([^\/]+)", AddSlashHandler, {}),
(r"/github/(?P<user>[^\/]+)/", user_handler, {}),
(r"/github/([^\/]+)/([^\/]+)", AddSlashHandler, {}),
(r"/github/(?P<user>[^\/]+)/(?P<repo>[^\/]+)/", GitHubRepoHandler, {}),
(
r"/github/([^\/]+)/([^\/]+)/(?:blob|raw)/([^\/]+)/(.*)/",
RemoveSlashHandler,
{},
),
(r"/github/([^\/]+)/([^\/]+)/tree/([^\/]+)", AddSlashHandler, {}),
]
+ [
(
r"/github/(?P<user>[^\/]+)/(?P<repo>[^\/]+)/tree/(?P<ref>[^\/]+)/(?P<path>.*)",
tree_handler,
{},
),
(
r"/github/(?P<user>[^\/]+)/(?P<repo>[^\/]+)/(?:blob|raw)/(?P<ref>[^\/]+)/(?P<path>.*)",
blob_handler,
{},
),
]
)
def uri_rewrites(rewrites=[]):
github_rewrites = [
# three different uris for a raw view
(
r"^https?://github\.com/([^\/]+)/([^\/]+)/raw/([^\/]+)/(.*)",
u"/github/{0}/{1}/blob/{2}/{3}",
),
(
r"^https?://raw\.github\.com/([^\/]+)/([^\/]+)/(.*)",
u"/github/{0}/{1}/blob/{2}",
),
(
r"^https?://raw\.githubusercontent\.com/([^\/]+)/([^\/]+)/(.*)",
u"/github/{0}/{1}/blob/{2}",
),
# trees & blobs
(
r"^https?://github.com/([\w\-]+)/([^\/]+)/(blob|tree)/(.*)$",
u"/github/{0}/{1}/{2}/{3}",
),
# user/repo
(r"^([\w\-]+)/([^\/]+)$", u"/github/{0}/{1}/tree/master/"),
# user
(r"^([\w\-]+)$", u"/github/{0}/"),
]
# github enterprise
if os.environ.get("GITHUB_API_URL", "") != "":
github_base_url = os.environ.get("GITHUB_API_URL").split("api/v3")[0]
github_rewrites.extend(
[
# raw view
(
r"^" + github_base_url + r"([^\/]+)/([^\/]+)/raw/([^\/]+)/(.*)",
u"/github/{0}/{1}/blob/{2}/{3}",
),
# trees & blobs
(
r"^" + github_base_url + r"([\w\-]+)/([^\/]+)/(blob|tree)/(.*)$",
u"/github/{0}/{1}/{2}/{3}",
),
# user/repo
(
r"^" + github_base_url + r"([\w\-]+)/([^\/]+)/?$",
u"/github/{0}/{1}/tree/master",
),
# user
(r"^" + github_base_url + r"([\w\-]+)/?$", u"/github/{0}/"),
]
)
return rewrites + github_rewrites
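The rewrite tables above are consumed by `transform_ipynb_uri` in `nbviewer.utils`. As a rough sketch of that contract (first matching pattern wins, and its captured groups fill the template), the helper below is an illustrative stand-in, not the real implementation:

```python
import re

# First-match-wins application of (pattern, template) rewrite pairs,
# mirroring how the uri_rewrites table above is consumed (sketch).
def apply_rewrites(uri, rewrites):
    for pattern, template in rewrites:
        match = re.match(pattern, uri)
        if match:
            return template.format(*match.groups())
    return uri

rewrites = [
    (r"^https?://github\.com/([^\/]+)/([^\/]+)/raw/([^\/]+)/(.*)",
     "/github/{0}/{1}/blob/{2}/{3}"),
    (r"^([\w\-]+)/([^\/]+)$", "/github/{0}/{1}/tree/master/"),
]

print(apply_rewrites("jupyter/nbviewer", rewrites))
# -> /github/jupyter/nbviewer/tree/master/
```

Because the list is ordered, the more specific raw-URL pattern must come before the bare `user/repo` fallback, which is why the table above lists the full-URL rewrites first.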

@@ -0,0 +1,112 @@
# encoding: utf-8
import unittest.mock as mock
from tornado.httpclient import AsyncHTTPClient
from tornado.log import app_log
from tornado.testing import AsyncTestCase
from ....utils import quote
from ..client import AsyncGitHubClient
class GithubClientTest(AsyncTestCase):
"""Tests that the github API client makes the correct http requests."""
def setUp(self):
super().setUp()
# Need a mock HTTPClient for the github client to talk to.
self.http_client = mock.create_autospec(AsyncHTTPClient)
# patch the environment so that we get a known url prefix.
with mock.patch("os.environ.get", return_value="https://api.github.com/"):
self.gh_client = AsyncGitHubClient(log=app_log, client=self.http_client)
def _get_url(self):
"""Get the last url requested from the mock http client."""
args, kw = self.http_client.fetch.call_args
return args[0]
def assertStartsWith(self, string, beginning):
"""Assert that a url has the correct beginning.
Github API requests involve non-trivial query strings. This is useful
when you want to compare URLs, but don't care about the querystring.
"""
self.assertTrue(
string.startswith(beginning),
"%s does not start with %s" % (string, beginning),
)
def test_basic_fetch(self):
"""Test the mock http client is hit"""
self.gh_client.fetch("https://api.github.com/url")
self.assertTrue(self.http_client.fetch.called)
def test_fetch_params(self):
"""Test params are passed through."""
params = {"unique_param_name": 1}
self.gh_client.fetch("https://api.github.com/url", params=params)
url = self._get_url()
self.assertTrue("unique_param_name" in url)
def test_log_rate_limit(self):
pass
def test_get_repos(self):
self.gh_client.get_repos("username")
url = self._get_url()
self.assertStartsWith(url, "https://api.github.com/users/username/repos")
def test_get_contents(self):
user = "username"
repo = "my_awesome_repo"
path = u"möre-path"
self.gh_client.get_contents(user, repo, path)
url = self._get_url()
correct_url = u"https://api.github.com" + quote(
u"/repos/username/my_awesome_repo/contents/möre-path"
)
self.assertStartsWith(url, correct_url)
def test_get_branches(self):
user = "username"
repo = "my_awesome_repo"
self.gh_client.get_branches(user, repo)
url = self._get_url()
correct_url = "https://api.github.com/repos/username/my_awesome_repo/branches"
self.assertStartsWith(url, correct_url)
def test_get_tags(self):
user = "username"
repo = "my_awesome_repo"
self.gh_client.get_tags(user, repo)
url = self._get_url()
correct_url = "https://api.github.com/repos/username/my_awesome_repo/tags"
self.assertStartsWith(url, correct_url)
def test_get_tree(self):
user = "username"
repo = "my_awesome_repo"
path = "extra-path"
self.gh_client.get_tree(user, repo, path)
url = self._get_url()
correct_url = (
"https://api.github.com/repos/username/my_awesome_repo/git/trees/master"
)
self.assertStartsWith(url, correct_url)
def test_get_gist(self):
gist_id = "ap90avn23iovv2ovn2309n"
self.gh_client.get_gist(gist_id)
url = self._get_url()
correct_url = "https://api.github.com/gists/" + gist_id
self.assertStartsWith(url, correct_url)
def test_get_gists(self):
user = "username"
self.gh_client.get_gists(user)
url = self._get_url()
correct_url = "https://api.github.com/users/username/gists"
self.assertStartsWith(url, correct_url)

@@ -0,0 +1,164 @@
# coding: utf-8
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import requests
from ....tests.base import FormatHTMLMixin
from ....tests.base import NBViewerTestCase
from ....tests.base import skip_unless_github_auth
class GitHubTestCase(NBViewerTestCase):
@skip_unless_github_auth
def ipython_example(self, *parts, **kwargs):
ref = kwargs.get("ref", "rel-2.0.0")
return self.url("github/ipython/ipython/blob/%s/examples" % ref, *parts)
@skip_unless_github_auth
def test_github(self):
url = self.ipython_example("Index.ipynb")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
@skip_unless_github_auth
def test_github_unicode(self):
url = self.url(
"github/tlapicka/IPythonNotebooks/blob",
"ee6d2d13b96023e5f5e38e4516803eb22ede977e",
u"Matplotlib -- osy a mřížka.ipynb",
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
@skip_unless_github_auth
def test_github_blob_redirect_unicode(self):
url = self.url(
"/urls/github.com/tlapicka/IPythonNotebooks/blob",
"ee6d2d13b96023e5f5e38e4516803eb22ede977e",
u"Matplotlib -- osy a mřížka.ipynb",
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/tlapicka/IPythonNotebooks/blob/", r.request.url)
@skip_unless_github_auth
def test_github_raw_redirect_unicode(self):
url = self.url(
"/url/raw.github.com/tlapicka/IPythonNotebooks",
"ee6d2d13b96023e5f5e38e4516803eb22ede977e",
u"Matplotlib -- osy a mřížka.ipynb",
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/tlapicka/IPythonNotebooks/blob/", r.request.url)
@skip_unless_github_auth
def test_github_tag(self):
url = self.ipython_example("Index.ipynb", ref="rel-2.0.0")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
@skip_unless_github_auth
def test_github_commit(self):
url = self.ipython_example(
"Index.ipynb", ref="7f5cbd622058396f1f33c4b26c8d205a8dd26d16"
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
@skip_unless_github_auth
def test_github_blob_redirect(self):
url = self.url(
"urls/github.com/ipython/ipython/blob/rel-2.0.0/examples", "Index.ipynb"
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/blob/master", r.request.url)
@skip_unless_github_auth
def test_github_raw_redirect(self):
url = self.url(
"urls/raw.github.com/ipython/ipython/rel-2.0.0/examples", "Index.ipynb"
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/blob/rel-2.0.0/examples", r.request.url)
@skip_unless_github_auth
def test_github_rawusercontent_redirect(self):
"""Test GitHub's new raw domain"""
url = self.url(
"urls/raw.githubusercontent.com/ipython/ipython/rel-2.0.0/examples",
"Index.ipynb",
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/blob/rel-2.0.0/examples", r.request.url)
@skip_unless_github_auth
def test_github_raw_redirect_2(self):
"""test /url/github.com/u/r/raw/ redirects"""
url = self.url(
"url/github.com/ipython/ipython/blob/rel-2.0.0/examples", "Index.ipynb"
)
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/blob/rel-2.0.0", r.request.url)
@skip_unless_github_auth
def test_github_repo_redirect(self):
url = self.url("github/ipython/ipython")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/tree/master", r.request.url)
@skip_unless_github_auth
def test_github_tree(self):
url = self.url("github/ipython/ipython/tree/rel-2.0.0/IPython/")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
self.assertIn("__init__.py", r.text)
@skip_unless_github_auth
def test_github_tree_redirect(self):
url = self.url("github/ipython/ipython/tree/rel-2.0.0/MANIFEST.in")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/blob/rel-2.0.0", r.request.url)
self.assertIn("global-exclude", r.text)
@skip_unless_github_auth
def test_github_blob_to_tree_redirect(self):
url = self.url("github/ipython/ipython/blob/rel-2.0.0/IPython")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
# verify redirect
self.assertIn("/github/ipython/ipython/tree/rel-2.0.0/IPython", r.request.url)
self.assertIn("__init__.py", r.text)
@skip_unless_github_auth
def test_github_ref_list(self):
url = self.url("github/ipython/ipython/tree/master")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
html = r.text
# verify branch is linked
self.assertIn("/github/ipython/ipython/tree/2.x/", html)
# verify tag is linked
self.assertIn("/github/ipython/ipython/tree/rel-2.3.0/", html)
class FormatHTMLGitHubTestCase(NBViewerTestCase, FormatHTMLMixin):
pass

@@ -0,0 +1,71 @@
# encoding: utf-8
import os
from unittest import TestCase
from ....utils import transform_ipynb_uri
from ..handlers import uri_rewrites
uri_rewrite_list = uri_rewrites()
class TestRewrite(TestCase):
def assert_rewrite(self, uri, rewrite):
new = transform_ipynb_uri(uri, uri_rewrite_list)
self.assertEqual(new, rewrite)
def assert_rewrite_ghe(self, uri, rewrite):
os.environ["GITHUB_API_URL"] = "https://example.com/api/v3/"
uri_rewrite_ghe_list = uri_rewrites()
os.environ.pop("GITHUB_API_URL", None)
new = transform_ipynb_uri(uri, uri_rewrite_ghe_list)
self.assertEqual(new, rewrite)
def test_githubusercontent(self):
uri = u"https://raw.githubusercontent.com/user/reopname/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/blob/deadbeef/a mřížka.ipynb"
self.assert_rewrite(uri, rewrite)
def test_blob(self):
uri = u"https://github.com/user/reopname/blob/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/blob/deadbeef/a mřížka.ipynb"
self.assert_rewrite(uri, rewrite)
def test_raw_uri(self):
uri = u"https://github.com/user/reopname/raw/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/blob/deadbeef/a mřížka.ipynb"
self.assert_rewrite(uri, rewrite)
def test_raw_subdomain(self):
uri = u"https://raw.github.com/user/reopname/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/blob/deadbeef/a mřížka.ipynb"
self.assert_rewrite(uri, rewrite)
def test_tree(self):
uri = u"https://github.com/user/reopname/tree/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/tree/deadbeef/a mřížka.ipynb"
self.assert_rewrite(uri, rewrite)
def test_userrepo(self):
uri = u"username/reponame"
rewrite = u"/github/username/reponame/tree/master/"
self.assert_rewrite(uri, rewrite)
def test_user(self):
uri = u"username"
rewrite = u"/github/username/"
self.assert_rewrite(uri, rewrite)
def test_ghe_blob(self):
uri = u"https://example.com/user/reopname/blob/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/blob/deadbeef/a mřížka.ipynb"
self.assert_rewrite_ghe(uri, rewrite)
def test_ghe_raw_uri(self):
uri = u"https://example.com/user/reopname/raw/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/blob/deadbeef/a mřížka.ipynb"
self.assert_rewrite_ghe(uri, rewrite)
def test_ghe_tree(self):
uri = u"https://example.com/user/reopname/tree/deadbeef/a mřížka.ipynb"
rewrite = u"/github/user/reopname/tree/deadbeef/a mřížka.ipynb"
self.assert_rewrite_ghe(uri, rewrite)

@@ -0,0 +1,2 @@
from .handlers import default_handlers
from .handlers import LocalFileHandler

@@ -0,0 +1,298 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import errno
import io
import os
import stat
from datetime import datetime
from tornado import iostream
from tornado import web
from .. import _load_handler_from_location
from ...utils import url_path_join
from ..base import cached
from ..base import RenderingHandler
class LocalFileHandler(RenderingHandler):
"""Renderer for /localfile
Serving notebooks from the local filesystem
"""
# cache key is full uri to avoid mixing download vs view paths
_cache_key_attr = "uri"
# provider root path
_localfile_path = "/localfile"
@property
def localfile_path(self):
if self.settings.get("localfile_follow_symlinks"):
return os.path.realpath(self.settings.get("localfile_path", ""))
else:
return os.path.abspath(self.settings.get("localfile_path", ""))
def breadcrumbs(self, path):
"""Build a list of breadcrumbs leading up to and including the
given local path.
Parameters
----------
path: str
Relative path up to and including the leaf directory or file to include
in the breadcrumbs list
Returns
-------
list
Breadcrumbs suitable for the link_breadcrumbs() jinja macro
"""
breadcrumbs = [
{"url": url_path_join(self.base_url, self._localfile_path), "name": "home"}
]
breadcrumbs.extend(super().breadcrumbs(path, self._localfile_path))
return breadcrumbs
async def download(self, fullpath):
"""Download the file at the given absolute path.
Parameters
==========
fullpath: str
Absolute path to the file
"""
filename = os.path.basename(fullpath)
st = os.stat(fullpath)
self.set_header("Content-Length", st.st_size)
# Escape commas to workaround Chrome issue with commas in download filenames
self.set_header(
"Content-Disposition",
"attachment; filename={};".format(filename.replace(",", "_")),
)
content = web.StaticFileHandler.get_content(fullpath)
if isinstance(content, bytes):
content = [content]
for chunk in content:
try:
self.write(chunk)
await self.flush()
except iostream.StreamClosedError:
return
def can_show(self, path):
"""
Generally determine whether the given path is displayable.
This function is useful for failing fast - further checks may
be applied at notebook render to confirm a file may be shown.
"""
if self.settings.get("localfile_follow_symlinks"):
fullpath = os.path.realpath(os.path.join(self.localfile_path, path))
else:
fullpath = os.path.abspath(
os.path.normpath(os.path.join(self.localfile_path, path))
)
if not fullpath.startswith(self.localfile_path):
self.log.warning("Directory traversal attempt: '%s'", fullpath)
return False
if not os.path.exists(fullpath):
self.log.warning("Path: '%s' does not exist", fullpath)
return False
if any(
part.startswith(".") or part.startswith("_")
for part in fullpath.split(os.sep)
):
return False
if not self.settings.get("localfile_any_user"):
fstat = os.stat(fullpath)
# Ensure the file/directory has other read access for all.
if not fstat.st_mode & stat.S_IROTH:
self.log.warning("Path: '%s' does not have read permissions", fullpath)
return False
if os.path.isdir(fullpath) and not fstat.st_mode & stat.S_IXOTH:
# skip directories we can't execute (i.e. list)
self.log.warning("Path: '%s' does not have execute permissions", fullpath)
return False
return True
async def get_notebook_data(self, path):
fullpath = os.path.join(self.localfile_path, path)
if not self.can_show(fullpath):
self.log.info("Path: '%s' is not visible from within nbviewer", fullpath)
raise web.HTTPError(404)
if os.path.isdir(fullpath):
html = self.show_dir(fullpath, path)
await self.cache_and_finish(html)
return
is_download = self.get_query_arguments("download")
if is_download:
await self.download(fullpath)
return
return fullpath
async def deliver_notebook(self, fullpath, path):
try:
with io.open(fullpath, encoding="utf-8") as f:
nbdata = f.read()
except IOError as ex:
if ex.errno == errno.EACCES:
# can't read the file, so don't reveal that it exists
self.log.info(
"Path : '%s' is not readable from within nbviewer", fullpath
)
raise web.HTTPError(404)
raise ex
# Explanation of some kwargs passed into `finish_notebook`:
# breadcrumbs: list of dict
# Breadcrumb 'name' and 'url' to render as links at the top of the notebook page
# title: str
# Title to use as the HTML page title (i.e., text on the browser tab)
await self.finish_notebook(
nbdata,
download_url="?download",
msg="file from localfile: %s" % path,
public=False,
breadcrumbs=self.breadcrumbs(path),
title=os.path.basename(path),
)
@cached
async def get(self, path):
"""Get a directory listing, rendered notebook, or raw file
at the given path based on the type and URL query parameters.
If the path points to an accessible directory, render its contents.
If the path points to an accessible notebook file, render it.
If the path points to an accessible file and the URL contains a
'download' query parameter, respond with the file as a download.
Parameters
==========
path: str
Local filesystem path
"""
fullpath = await self.get_notebook_data(path)
# get_notebook_data returns None if a directory is to be shown or a notebook is to be downloaded,
# i.e. if no notebook is supposed to be rendered, making deliver_notebook inappropriate
if fullpath:
await self.deliver_notebook(fullpath, path)
# Make available to increase modularity for subclassing
# E.g. so subclasses can implement templates with custom logic
# without having to copy-paste the entire show_dir method
def render_dirview_template(self, entries, breadcrumbs, title, **namespace):
"""
breadcrumbs: list of dict
Breadcrumb 'name' and 'url' to render as links at the top of the notebook page
title: str
Title to use as the HTML page title (i.e., text on the browser tab)
"""
return self.render_template(
"dirview.html",
entries=entries,
breadcrumbs=breadcrumbs,
title=title,
**namespace
)
def show_dir(self, fullpath, path, **namespace):
"""Render the directory view template for a given filesystem path.
Parameters
==========
fullpath: string
Absolute path on disk to show
path: string
URL path equating to the path on disk
Returns
=======
str
Rendered HTML
"""
entries = []
dirs = []
ipynbs = []
try:
contents = os.listdir(fullpath)
except IOError as ex:
if ex.errno == errno.EACCES:
# can't access the dir, so don't give away its presence
self.log.info(
"Contents of path: '%s' cannot be listed from within nbviewer",
fullpath,
)
raise web.HTTPError(404)
for f in contents:
absf = os.path.join(fullpath, f)
if not self.can_show(absf):
continue
entry = {}
entry["name"] = f
# We need to make UTC timestamps conform to true ISO-8601 by
# appending Z(ulu). Without a timezone, the spec says it should be
# treated as local time which is not what we want and causes
# moment.js on the frontend to show times in the past or future
# depending on the user's timezone.
# https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators
if os.path.isdir(absf):
st = os.stat(absf)
dt = datetime.utcfromtimestamp(st.st_mtime)
entry["modtime"] = dt.isoformat() + "Z"
entry["url"] = url_path_join(self._localfile_path, path, f)
entry["class"] = "fa fa-folder-open"
dirs.append(entry)
elif f.endswith(".ipynb"):
st = os.stat(absf)
dt = datetime.utcfromtimestamp(st.st_mtime)
entry["modtime"] = dt.isoformat() + "Z"
entry["url"] = url_path_join(self._localfile_path, path, f)
entry["class"] = "fa fa-book"
ipynbs.append(entry)
dirs.sort(key=lambda e: e["name"])
ipynbs.sort(key=lambda e: e["name"])
entries.extend(dirs)
entries.extend(ipynbs)
html = self.render_dirview_template(
entries=entries,
breadcrumbs=self.breadcrumbs(path),
title=url_path_join(path, "/"),
**namespace
)
return html
def default_handlers(handlers=[], **handler_names):
"""Tornado handlers"""
local_handler = _load_handler_from_location(handler_names["local_handler"])
return handlers + [(r"/localfile/?(.*)", local_handler, {})]
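The directory-traversal guard in `can_show` above boils down to normalizing the joined path and checking that the result still lives under the configured serving root. A standalone sketch of that idea (names here are illustrative, not the handler's API):

```python
import os.path

# Reject any relative path that, once joined and normalized, escapes
# the serving root -- the same idea as LocalFileHandler.can_show.
def inside_root(root, path):
    root = os.path.abspath(root)
    fullpath = os.path.abspath(os.path.normpath(os.path.join(root, path)))
    return fullpath.startswith(root)

print(inside_root("/srv/notebooks", "demo/index.ipynb"))  # -> True
print(inside_root("/srv/notebooks", "../../etc/passwd"))  # -> False
```

Note the handler layers further checks on top of this (hidden-file prefixes, world-readable permission bits); the path containment test is only the first gate.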

@@ -0,0 +1,50 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import requests
from ....tests.base import FormatHTMLMixin
from ....tests.base import NBViewerTestCase
class LocalFileDefaultTestCase(NBViewerTestCase):
@classmethod
def get_server_cmd(cls):
return super().get_server_cmd() + ["--localfiles=."]
def test_url(self):
## assumes being run from base of this repo
url = self.url("localfile/nbviewer/tests/notebook.ipynb")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
class FormatHTMLLocalFileDefaultTestCase(LocalFileDefaultTestCase, FormatHTMLMixin):
pass
class LocalFileRelativePathTestCase(NBViewerTestCase):
@classmethod
def get_server_cmd(cls):
return super().get_server_cmd() + ["--localfiles=nbviewer"]
def test_url(self):
## assumes being run from base of this repo
url = self.url("localfile/tests/notebook.ipynb")
r = requests.get(url)
self.assertEqual(r.status_code, 200)
def test_404(self):
## assumes being run from base of this repo
url = self.url("localfile/doesntexist")
r = requests.get(url)
self.assertEqual(r.status_code, 404)
class FormatHTMLLocalFileRelativePathTestCase(
LocalFileRelativePathTestCase, FormatHTMLMixin
):
pass

@@ -0,0 +1,2 @@
from .handlers import default_handlers
from .handlers import uri_rewrites

@@ -0,0 +1,104 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
from urllib import robotparser
from urllib.parse import urlparse
from tornado import httpclient
from tornado import web
from tornado.escape import url_unescape
from .. import _load_handler_from_location
from ...utils import quote
from ...utils import response_text
from ..base import cached
from ..base import RenderingHandler
class URLHandler(RenderingHandler):
"""Renderer for /url or /urls"""
async def get_notebook_data(self, secure, netloc, url):
proto = "http" + secure
netloc = url_unescape(netloc)
if "/?" in url:
url, query = url.rsplit("/?", 1)
else:
query = None
remote_url = u"{}://{}/{}".format(proto, netloc, quote(url))
if query:
remote_url = remote_url + "?" + query
if not url.endswith(".ipynb"):
# this is how we handle relative links (files/ URLs) in notebooks
# if it's not a .ipynb URL and it is a link from a notebook,
# redirect to the original URL rather than trying to render it as a notebook
refer_url = self.request.headers.get("Referer", "").split("://")[-1]
if refer_url.startswith(self.request.host + "/url"):
self.redirect(remote_url)
return
parse_result = urlparse(remote_url)
robots_url = parse_result.scheme + "://" + parse_result.netloc + "/robots.txt"
public = False # Assume non-public
try:
robots_response = await self.fetch(robots_url)
robotstxt = response_text(robots_response)
rfp = robotparser.RobotFileParser()
rfp.set_url(robots_url)
rfp.parse(robotstxt.splitlines())
public = rfp.can_fetch("*", remote_url)
except httpclient.HTTPError as e:
self.log.debug(
"Robots.txt not available for {}".format(remote_url), exc_info=True
)
public = True
except Exception as e:
self.log.error(e)
return remote_url, public
async def deliver_notebook(self, remote_url, public):
response = await self.fetch(remote_url)
try:
nbjson = response_text(response, encoding="utf-8")
except UnicodeDecodeError:
self.log.error("Notebook is not utf8: %s", remote_url, exc_info=True)
raise web.HTTPError(400)
await self.finish_notebook(
nbjson,
download_url=remote_url,
msg="file from url: %s" % remote_url,
public=public,
request=self.request,
)
@cached
async def get(self, secure, netloc, url):
remote_url, public = await self.get_notebook_data(secure, netloc, url)
await self.deliver_notebook(remote_url, public)
def default_handlers(handlers=[], **handler_names):
"""Tornado handlers"""
url_handler = _load_handler_from_location(handler_names["url_handler"])
return handlers + [
(r"/url(?P<secure>[s]?)/(?P<netloc>[^/]+)/(?P<url>.*)", url_handler, {})
]
def uri_rewrites(rewrites=[]):
return rewrites + [("^http(s?)://(.*)$", u"/url{0}/{1}"), ("^(.*)$", u"/url/{0}")]
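The robots.txt gate in `get_notebook_data` above decides whether a fetched notebook may be marked `public`. Its use of the stdlib `RobotFileParser` reduces to the following sketch (the inline robots.txt body and URLs are made up for illustration):

```python
from urllib import robotparser

# Parse a robots.txt body and ask whether a generic crawler ("*") may
# fetch a given URL -- the same check URLHandler uses to decide whether
# a rendered notebook may be treated as public.
robotstxt = """\
User-agent: *
Disallow: /private/
"""

rfp = robotparser.RobotFileParser()
rfp.set_url("https://example.com/robots.txt")
rfp.parse(robotstxt.splitlines())

print(rfp.can_fetch("*", "https://example.com/public/nb.ipynb"))   # -> True
print(rfp.can_fetch("*", "https://example.com/private/nb.ipynb"))  # -> False
```

As in the handler, a missing robots.txt (an HTTP error on fetch) is treated as permission: the notebook defaults to public in that branch.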

@@ -0,0 +1,15 @@
# -*- coding: utf-8 -*-
import requests
from ....tests.base import NBViewerTestCase
class ForceUTF8TestCase(NBViewerTestCase):
def test_utf8(self):
""" #507, bitbucket returns no content headers, but _is_ serving utf-8
"""
response = requests.get(
self.url("/urls/bitbucket.org/sandiego206/asdasd/raw/master/Untitled.ipynb")
)
self.assertEqual(response.status_code, 200)
self.assertIn("ñ", response.text)

@@ -0,0 +1,38 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
import unittest
import requests
from ....tests.base import FormatHTMLMixin
from ....tests.base import NBViewerTestCase
class URLTestCase(NBViewerTestCase):
def test_url(self):
url = self.url("url/jdj.mit.edu/~stevenj/IJulia Preview.ipynb")
r = requests.get(url)
# Base class overrides assertIn to do unicode in unicode checking
# We want to use the original unittest implementation
unittest.TestCase.assertIn(self, r.status_code, (200, 202))
self.assertIn("Download Notebook", r.text)
def test_urls_with_querystring(self):
# This notebook is only available if the querystring is passed through.
# Notebook URL: https://bug1348008.bmoattachments.org/attachment.cgi?id=8860059
url = self.url(
"urls/bug1348008.bmoattachments.org/attachment.cgi/%3Fid%3D8860059"
)
r = requests.get(url)
# Base class overrides assertIn to do unicode in unicode checking
# We want to use the original unittest implementation
unittest.TestCase.assertIn(self, r.status_code, (200, 202))
self.assertIn("Download Notebook", r.text)
class FormatHTMLURLTestCase(URLTestCase, FormatHTMLMixin):
pass

@@ -0,0 +1,64 @@
"""Object for tracking rate-limited requests"""
# Copyright (c) Jupyter Development Team.
# Distributed under the terms of the Modified BSD License.
import hashlib
from tornado.log import app_log
from tornado.web import HTTPError
class RateLimiter(object):
"""Rate limit checking object"""
def __init__(self, limit, interval, cache):
self.limit = limit
self.interval = interval
self.cache = cache
def key_for_handler(self, handler):
"""Identify a visitor.
Currently combine ip + user-agent.
We don't need to be perfect.
"""
agent = handler.request.headers.get("User-Agent", "")
return "rate-limit:{}:{}".format(
handler.request.remote_ip,
hashlib.md5(agent.encode("utf8", "replace")).hexdigest(),
)
async def check(self, handler):
"""Check the rate limit for a handler.
Identifies the source by ip and user-agent.
If the rate limit is exceeded, raise HTTPError(429)
"""
if not self.limit:
return
key = self.key_for_handler(handler)
added = await self.cache.add(key, 1, self.interval)
if not added:
# it's been seen before, use incr
try:
count = await self.cache.incr(key)
except Exception:
app_log.warning("Failed to increment rate limit for %s", key, exc_info=True)
return
app_log.debug(
"Rate limit remaining for %r: %s/%s",
key,
self.limit - count,
self.limit,
)
if count and count >= self.limit:
minutes = self.interval // 60
raise HTTPError(
429,
"Rate limit exceeded for {ip} ({limit} req / {minutes} min)."
" Try again later.".format(
ip=handler.request.remote_ip, limit=self.limit, minutes=minutes
),
)
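`check()` above relies on two cache operations: `add()` succeeds only for a key not yet present (starting a new interval window), while `incr()` bumps the counter for a returning visitor. A minimal in-memory stand-in for that contract (the `DictCache` class is hypothetical, written only to demonstrate the add/incr semantics):

```python
import asyncio

# Hypothetical in-memory stand-in for the async cache RateLimiter.check()
# expects: add() succeeds only for new keys, incr() bumps the counter.
class DictCache:
    def __init__(self):
        self.data = {}

    async def add(self, key, value, ttl):
        if key in self.data:
            return False
        self.data[key] = value
        return True

    async def incr(self, key):
        self.data[key] += 1
        return self.data[key]

async def demo():
    cache = DictCache()
    first = await cache.add("rate-limit:ip:ua", 1, 600)
    counts = [await cache.incr("rate-limit:ip:ua") for _ in range(3)]
    return first, counts

print(asyncio.run(demo()))
# -> (True, [2, 3, 4])
```

A real deployment would back this with a store that honors the `ttl` argument (which this toy class accepts but ignores), so counters expire at the end of each interval.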

@@ -0,0 +1,63 @@
# -----------------------------------------------------------------------------
# Copyright (C) Jupyter Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
# -----------------------------------------------------------------------------
from nbconvert.exporters import Exporter
from tornado.log import app_log
# -----------------------------------------------------------------------------
#
# -----------------------------------------------------------------------------
class NbFormatError(Exception):
pass
exporters = {}
def render_notebook(format, nb, url=None, forced_theme=None, config=None):
exporter = format["exporter"]
if not isinstance(exporter, Exporter):
# allow exporter to be passed as a class, rather than instance
# because Exporter instances cannot be passed across multiprocessing boundaries
# instances are cached by class to avoid repeated instantiation of duplicates
exporter_cls = exporter
if exporter_cls not in exporters:
app_log.info("instantiating %s" % exporter_cls.__name__)
exporters[exporter_cls] = exporter_cls(config=config, log=app_log)
exporter = exporters[exporter_cls]
css_theme = nb.get("metadata", {}).get("_nbviewer", {}).get("css", None)
if not css_theme or not css_theme.strip():
# treat an empty or whitespace-only theme as unset
css_theme = None
if forced_theme:
css_theme = forced_theme
# get the notebook title, if any
try:
name = nb.metadata.name
except AttributeError:
name = ""
if not name and url is not None:
name = url.rsplit("/")[-1]
if not name.endswith(".ipynb"):
name = name + ".ipynb"
html, resources = exporter.from_notebook_node(nb)
if "postprocess" in format:
html, resources = format["postprocess"](html, resources)
config = {"download_name": name, "css_theme": css_theme}
return html, config
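The fallback title logic above derives a download name from the notebook URL whenever the notebook metadata carries no name. Extracted as a standalone sketch (the function name is ours, not part of the module):

```python
# Derive a download filename from a notebook URL: take the last path
# segment and ensure it carries the .ipynb extension, as in
# render_notebook's fallback when nb.metadata.name is absent.
def download_name(url):
    name = url.rsplit("/")[-1]
    if not name.endswith(".ipynb"):
        name = name + ".ipynb"
    return name

print(download_name("https://example.com/files/analysis"))
# -> analysis.ipynb
```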

@@ -0,0 +1 @@
{"directory": "components"}

@@ -0,0 +1,14 @@
{
"name": "static",
"version": "0.0.0",
"dependencies": {
"animate.css": "~3.2",
"headroom.js": "0.7.0",
"requirejs": "~2.1",
"moment": "~2.8.4",
"bootstrap": "components/bootstrap#~3.3",
"font-awesome": "~4",
"pygments": "~2.0.0",
"reveal.js": "~3.1.0"
}
}

@@ -0,0 +1,39 @@
@font-face {
font-family: "Computer Modern";
src: url('https://mirrors.ctan.org/fonts/cm-unicode/fonts/otf/cmunss.otf');
}
div.cell{
width:800px;
margin-left:auto;
margin-right:auto;
}
h1 {
font-family: "Charis SIL", Palatino, serif;
}
div.text_cell_render{
font-family: Computer Modern, "Helvetica Neue", Arial, Helvetica, Geneva, sans-serif;
line-height: 145%;
font-size: 120%;
width:800px;
margin-left:auto;
margin-right:auto;
}
.CodeMirror{
font-family: "Source Code Pro", source-code-pro,Consolas, monospace;
}
.prompt{
display: none;
}
.text_cell_render h5 {
font-weight: 300;
font-size: 16pt;
color: #4057A1;
font-style: italic;
margin-bottom: .5em;
margin-top: 0.5em;
display: block;
}
.warning{
color: rgb(240, 20, 20);
}

@@ -0,0 +1,173 @@
div.cell {
width: inherit ;
background-color: #f3f3f3 ;
}
.container {
max-width:50em;
}
/* block fixes margin on input boxes */
/* in firefox */
.input.hbox {
max-width: 50em;
}
.input_area {
background-color: white ;
}
.output_area pre {
font-family: "Source Code Pro", source-code-pro, Consolas, monospace;
border: 0px;
}
div.output_text {
font-family: "Source Code Pro", source-code-pro, Consolas, monospace;
}
div.text_cell {
max-width: 35em;
text-align: left;
}
div.prompt {
width: 0px;
visibility: hidden ;
}
.code_cell {
background-color: #f3f3f3;
}
.highlight {
background-color: #ffffff;
}
div.input_prompt {
visibility: hidden;
width: 0 ;
}
div.text_cell_render {
font-family: "Minion Pro", "minion-pro", "Charis SIL", Palatino, serif ;
font-size: 14pt ;
line-height: 145% ;
max-width: 35em ;
text-align: left ;
background-color: #f3f3f3 ;
}
div.text_cell_render h1 {
display: block;
font-size: 28pt;
color: #3B3B3B;
margin-bottom: 0em;
margin-top: 0.5em;
}
.rendered_html li {
margin-bottom: .25em;
color: #3B3B3B;
}
div.text_cell_render h2:before {
content: "\2FFA";
margin-right: 0.5em;
font-size: .5em;
vertical-align: baseline;
border-top: 1px;
}
.hiterm {
font-weight: 500;
color: #DC143C;
}
.text_cell_render h2 {
font-size: 20pt;
margin-bottom: 0em;
margin-top: 0.5em;
display: block;
color: #3B3B3B;
}
.MathJax_Display {
/*text-align: center ;*/
margin-left: 2em ;
margin-top: .5em ;
margin-bottom: .5em ;
}
.text_cell_render h3 {
font-size: 14pt;
font-weight: 600;
font-style: italic;
margin-bottom: -0.5em;
margin-top: -0.25em;
color: #3B3B3B;
text-indent: 2em;
}
.text_cell_render h5 {
font-weight: 300;
font-size: 14pt;
color: #4057A1;
font-style: italic;
margin-bottom: .5em;
margin-top: 0.5em;
display: block;
}
.CodeMirror {
font-family: "Source Code Pro", source-code-pro, Consolas, monospace ;
font-size: 10pt;
background: #fffffe; /* #f0f8fb #e2eef9*/
border: 0px;
}
.rendered_html {
}
.rendered_html code {
font-family: "Source Code Pro", source-code-pro,Consolas, monospace;
font-size: 85%;
}
pre, code, kbd, samp { font-family: "Source Code Pro", source-code-pro, Consolas, monospace; }
.rendered_html p {
text-align: left;
color: #3B3B3B;
margin-bottom: .5em;
}
.rendered_html p+p {
text-indent: 1em;
margin-top: 0;
}
.rendered_html ol {
list-style: decimal;
/*margin: 1em 2em;*/
}
.rendered_html ol ol {
list-style: decimal;
}
.rendered_html ol ol ol {
list-style: decimal;
}
body { background-color: #f3f3f3; }
.rendered_html p.hangpar {
text-indent: 0;
}
</style>
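Stylesheets like the ones above are selected per notebook: the render code earlier reads a theme name from the notebook's `metadata._nbviewer.css` key, treats blank values as unset, and lets a server-side override win. A standalone sketch of that lookup, assuming plain-dict notebook JSON; `pick_css_theme` is a hypothetical wrapper name, not an nbviewer API:

```python
def pick_css_theme(nb, forced_theme=None):
    # The theme name lives at metadata._nbviewer.css in the notebook JSON.
    theme = nb.get("metadata", {}).get("_nbviewer", {}).get("css", None)
    # Empty or whitespace-only values count as "no theme chosen".
    if not theme or not theme.strip():
        theme = None
    # A theme forced by the server overrides the notebook's own choice.
    if forced_theme:
        theme = forced_theme
    return theme

print(pick_css_theme({"metadata": {"_nbviewer": {"css": "cv"}}}))   # cv
print(pick_css_theme({"metadata": {"_nbviewer": {"css": "   "}}}))  # None
```

The chained `.get(..., {})` calls make the lookup safe on notebooks that carry no `_nbviewer` metadata at all.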

19 binary image files added in this commit (contents not shown); sizes range from 475 B to 200 KiB.

Some files were not shown because too many files have changed in this diff.