.. index::
    single: PasswordHash interface
    single: custom hash handler; requirements

.. _hash-tutorial:

.. currentmodule:: passlib.ifc

===========================================
:class:`~passlib.ifc.PasswordHash` Tutorial
===========================================

Overview
========
Passlib supports a large number of hash algorithms,
all of which can be imported from the :mod:`passlib.hash` module.
While the exact options and behavior will vary between each algorithm,
all of the hashes provided by Passlib use the same interface,
defined by the :class:`passlib.ifc.PasswordHash` abstract class.

The :class:`!PasswordHash` class provides a generic interface for interacting
individually with the various hashing algorithms.
It offers methods and attributes for a number of use-cases,
which fall into three general categories:

   * Creating & verifying hashes

   * Examining the configuration of a hasher,
     and customizing the defaults.

   * Assorted supplementary methods.

.. seealso::

   * :mod:`passlib.ifc` -- API reference of all the methods and attributes
     of the :class:`!PasswordHash` class.

   * :ref:`passlib.context.CryptContext <context-tutorial>` --
     For working with multiple hash formats at once
     (such as a user account table with multiple existing hash formats).

.. _hash-verifying:
.. _password-hash-examples:

Hashing & Verifying
===================
While all the hashers in :mod:`passlib.hash` offer a range of methods and attributes,
the main activities applications will need to perform are hashing and verifying passwords.
This can be done with the :meth:`PasswordHash.hash` and :meth:`PasswordHash.verify` methods.

.. rst-class:: float-center without-title

.. caution::

   **Changed in 1.7:**

   Prior releases used :meth:`PasswordHash.encrypt` for hashing,
   which has now been renamed to :meth:`PasswordHash.hash`.
   A compatibility alias is present in 1.7, but will be removed in Passlib 2.0.

Hashing
-------
First, import the desired hash.  The following example uses the :class:`~passlib.hash.pbkdf2_sha256` class
(which derives from :class:`!PasswordHash`)::

    >>> # import the desired hasher
    >>> from passlib.hash import pbkdf2_sha256

Use :meth:`PasswordHash.hash` to hash a password.  This call takes care of unicode encoding,
picking default rounds values, and generating a random salt::

    >>> hash = pbkdf2_sha256.hash("password")
    >>> hash
    '$pbkdf2-sha256$29000$9t7be09prfXee2/NOUeotQ$Y.RDnnq8vsezSZSKy1QNy6xhKPdoBIwc.0XDdRm9sJ8'

Note that since each call generates a new salt, the contents of the resulting
hash will differ between calls (despite using the same password as input)::

    >>> hash2 = pbkdf2_sha256.hash("password")
    >>> hash2
    '$pbkdf2-sha256$29000$V0rJeS.FcO4dw/h/D6E0Bg$FyLs7omUppxzXkARJQSl.ozcEOhgp3tNgNsKIAhKmp8'
                          ^^^^^^^^^^^^^^^^^^^^^^

Verifying
---------
Subsequently, you can call :meth:`PasswordHash.verify` to check user input
against an existing hash::

    >>> pbkdf2_sha256.verify("password", hash)
    True

    >>> pbkdf2_sha256.verify("joshua", hash)
    False
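
To show why two hashes of the same password differ and how verification can
still succeed, here is a minimal standard-library sketch of the salt-then-compare
idea. It is not Passlib's implementation and does not produce Passlib-compatible
hash strings; ``toy_hash`` and ``toy_verify`` are hypothetical names used only
for illustration.

```python
import hashlib
import hmac
import os


def toy_hash(password, salt=None, rounds=29000):
    """Return (salt, rounds, digest) computed with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every new hash
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, rounds)
    return salt, rounds, digest


def toy_verify(password, record):
    """Recompute the digest with the stored salt and rounds, then compare."""
    salt, rounds, expected = record
    _, _, candidate = toy_hash(password, salt=salt, rounds=rounds)
    return hmac.compare_digest(candidate, expected)  # constant-time comparison


record1 = toy_hash("password")
record2 = toy_hash("password")
print(record1[2] != record2[2])         # same password, different salts
print(toy_verify("password", record1))  # correct password
print(toy_verify("joshua", record1))    # wrong password
```

Because the salt and rounds are stored next to the digest, verification only
needs the candidate password and the stored record.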

.. _hash-unicode-behavior:

Unicode & non-ASCII Characters
------------------------------
*Sidenote regarding unicode passwords & non-ASCII characters:*

For the majority of hash algorithms and use-cases, passwords should
be provided as either :class:`!unicode` or ``utf-8``-encoded :class:`!bytes`.

One exception is legacy hashes that were generated
using a different character encoding. In this case, passwords should be
encoded using the correct encoding before they are passed to :meth:`!verify`;
otherwise users may not be able to log in successfully.

For proper internationalization, applications should also take care to ensure
unicode inputs are normalized to a single representation before hashing.
The :func:`passlib.utils.saslprep` function can be used for this purpose.
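
The following sketch illustrates why normalization matters, using only the
standard library's :mod:`unicodedata` module. It applies plain NFKC
normalization, which is just one part of what SASLprep does, so it should be
read as a demonstration of the problem rather than a substitute for
:func:`passlib.utils.saslprep`. The ``normalize_password`` helper is a
hypothetical name.

```python
import unicodedata

composed = "caf\u00e9"      # 'é' stored as a single code point
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

# Both strings look the same on screen but encode to different bytes,
# so hashing them unchanged would give unrelated results.
print(composed.encode("utf-8") == decomposed.encode("utf-8"))


def normalize_password(password):
    """Normalize to NFKC so equivalent inputs share one representation."""
    return unicodedata.normalize("NFKC", password)


print(normalize_password(composed) == normalize_password(decomposed))
```

Whatever normalization is chosen, it has to be applied identically when
hashing and when verifying.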

.. _hash-configuring:

Customizing the Configuration
=============================

The using() Method
------------------
Each hasher contains a number of :ref:`informational attributes <informational-attributes>`,
many of which can be customized to change the properties of the hashes
generated by :meth:`PasswordHash.hash`.  When you want to change the defaults,
you don't have to modify the hasher class directly, or pass the options to each call to :meth:`!PasswordHash.hash`.

Instead, all the hashes offer a :meth:`PasswordHash.using` method.
This is a powerful method which accepts most hash informational attributes,
as well as some other hash-specific configuration keywords; and returns
a subclass of the original hasher (or an object with an identical interface).
The returned object inherits the default settings from its parent,
but integrates any values you choose to override.

.. rst-class:: float-center without-title

.. caution::

   **Changed in 1.7:**

   Prior releases required you to pass custom settings to each :meth:`PasswordHash.encrypt` call.
   That usage pattern is deprecated, and will be removed in Passlib 2.0;
   code should be switched to use :meth:`PasswordHash.using`, as shown below.

Usage Example
-------------
As an example, if the hasher you select supports a variable number of iterations
(such as :class:`~passlib.hash.pbkdf2_sha256`), you can specify a custom value
using the ``rounds`` keyword.

Here, the default class uses 29000 rounds::

    >>> from passlib.hash import pbkdf2_sha256

    >>> pbkdf2_sha256.default_rounds
    29000

    >>> pbkdf2_sha256.hash("password")
    '$pbkdf2-sha256$29000$V0rJeS.FcO4dw/h/D6E0Bg$FyLs7omUppxzXkARJQSl.ozcEOhgp3tNgNsKIAhKmp8'
                    ^^^^^

But if we call :meth:`PasswordHash.using`, we can override this value::

    >>> custom_pbkdf2 = pbkdf2_sha256.using(rounds=123456)
    >>> custom_pbkdf2.default_rounds
    123456

    >>> custom_pbkdf2.hash("password")
    '$pbkdf2-sha256$123456$QwjBmJPSOsf4HyNE6L239g$8m1pnP69EYeOiKKb5sNSiYw9M8pJMyeW.CSm0KKO.GI'
                    ^^^^^^
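
To make the "returns a subclass" behavior concrete, here is a simplified sketch
of how a ``using()``-style method could be written. ``ToyHasher`` is a
hypothetical class, not part of Passlib, and the real method accepts many more
keywords and performs validation that is omitted here.

```python
class ToyHasher:
    """Hypothetical stand-in for a hasher class; not part of Passlib."""

    default_rounds = 29000

    @classmethod
    def using(cls, rounds=None):
        """Return a subclass that overrides only the settings passed in."""
        overrides = {}
        if rounds is not None:
            overrides["default_rounds"] = rounds
        # type() builds a new subclass; anything not overridden is inherited.
        return type(cls.__name__, (cls,), overrides)


custom = ToyHasher.using(rounds=123456)
print(custom.default_rounds)          # value overridden on the subclass
print(ToyHasher.default_rounds)       # original class is left untouched
print(issubclass(custom, ToyHasher))  # the result is still a ToyHasher
```

Returning a new object instead of mutating the class means other code that
imported the original hasher keeps seeing the original defaults.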

Other Keywords
--------------
While hashes frequently have additional keywords supported by :meth:`!using`,
the basic set of settings you can customize can be found by inspecting
the :attr:`PasswordHash.setting_kwds` attribute::

    >>> pbkdf2_sha256.setting_kwds
    ("salt", "salt_size", "rounds")

For instance, the following generates pbkdf2 hashes with a 32-byte salt
instead of the default 16::

    >>> pbkdf2_sha256.using(salt_size=32).hash("password")
    '$pbkdf2-sha256$29000$tPZ.r5UyZgyhNEaI8Z5z7r1X6p1zTknJ.T/nHINwbq0$RlM49Qf5qRraHx.L7gq3hKIKSMLttrG1zWmWXyfXqc8'
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This method is also used internally by the :ref:`CryptContext <context-tutorial>`
class in order to create a custom hasher configured based on the CryptContext policy
it was provided.

.. seealso::

    * :meth:`PasswordHash.using` -- API reference

Context Keywords
================
While the :meth:`PasswordHash.hash` example above works for most hashes,
a small number of algorithms require you provide external data
(such as a username) every time a hash is calculated.

An example of this is the :class:`~passlib.hash.oracle10` hash,
where hashing requires a username::

    >>> from passlib.hash import oracle10
    >>> hash = oracle10.hash("secret", user="admin")
    >>> hash
    'B858CE295C95193F'

The difference between this and specifying something like a rounds setting
(see :ref:`hash-configuring` above) is that a configuration option
only needs to be specified once, and is then encoded into the hash string itself.
A context keyword, by contrast, represents something that isn't stored in the hash string,
and needs to be specified every time you call :meth:`PasswordHash.hash` **or**
:meth:`PasswordHash.verify`::

    >>> oracle10.verify("secret", hash, user="admin")
    True

In this example, if either the username OR password is wrong,
verify() will fail::

    >>> oracle10.verify("secret", hash, user="wronguser")
    False

    >>> oracle10.verify("wrongpassword", hash, user="admin")
    False

Forgetting to include a context keyword when it's required will cause a TypeError::

    >>> hash = oracle10.hash("password")
    Traceback (most recent call last):
        <traceback omitted>
    TypeError: user must be unicode or bytes, not None

Whether a hash requires external parameters (such as ``user``)
can be determined from its documentation page; but also programmatically from
its :attr:`PasswordHash.context_kwds` attribute::

    >>> oracle10.context_kwds
    ("user",)

    >>> pbkdf2_sha256.context_kwds
    ()
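
As a rough illustration of the programmatic check, the helper below reports
which context keywords a hasher expects but the caller has not supplied. It
relies only on the ``context_kwds`` attribute described above;
``missing_context`` and the two stub classes are hypothetical and stand in for
real hashers so the sketch runs without Passlib installed.

```python
def missing_context(hasher, supplied):
    """Return the context keywords the hasher expects but were not supplied."""
    return [name for name in hasher.context_kwds if name not in supplied]


class StubUserHasher:
    """Hypothetical stand-in for a hasher that requires a username."""

    context_kwds = ("user",)


class StubPlainHasher:
    """Hypothetical stand-in for a hasher with no context keywords."""

    context_kwds = ()


print(missing_context(StubUserHasher, {}))                 # 'user' is missing
print(missing_context(StubUserHasher, {"user": "admin"}))  # nothing missing
print(missing_context(StubPlainHasher, {}))                # nothing required
```

A check like this lets an application report a missing keyword with its own
error message before the hasher raises.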

Identifying Hashes
==================
One of the rarer use-cases is the need to identify whether a string
recognizably belongs to a given hasher class.  This can be important
in some cases, because attempting to call :meth:`PasswordHash.verify`
with another algorithm's hash will result in a ValueError::

    >>> from passlib.hash import pbkdf2_sha256, md5_crypt

    >>> other_hash = md5_crypt.hash("password")

    >>> pbkdf2_sha256.verify("password", other_hash)
    Traceback (most recent call last):
        <traceback omitted>
    ValueError: not a valid pbkdf2_sha256 hash

This can be prevented by using the :meth:`PasswordHash.identify` method,
which determines whether a hash belongs to a given algorithm::

    >>> hash = pbkdf2_sha256.hash("password")
    >>> pbkdf2_sha256.identify(hash)
    True

    >>> pbkdf2_sha256.identify(other_hash)
    False
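
One way to combine the two calls is a small wrapper that checks ``identify()``
first and only then calls ``verify()``. The sketch below assumes nothing beyond
those two methods; ``verify_if_recognized`` is a hypothetical helper, and
``StubHasher`` is a deliberately insecure stand-in used so the example runs
without Passlib installed.

```python
def verify_if_recognized(hasher, secret, stored_hash):
    """Return the verify() result, or None if the hash is not recognized."""
    if not hasher.identify(stored_hash):
        return None  # foreign or malformed hash: skip verify() entirely
    return hasher.verify(secret, stored_hash)


class StubHasher:
    """Hypothetical, insecure stand-in that mimics identify() and verify()."""

    prefix = "$stub$"

    @classmethod
    def hash(cls, secret):
        return cls.prefix + secret[::-1]  # reversed text, for the demo only

    @classmethod
    def identify(cls, stored_hash):
        return stored_hash.startswith(cls.prefix)

    @classmethod
    def verify(cls, secret, stored_hash):
        if not cls.identify(stored_hash):
            raise ValueError("not a valid stub hash")
        return stored_hash == cls.hash(secret)


stored = StubHasher.hash("password")
print(verify_if_recognized(StubHasher, "password", stored))        # match
print(verify_if_recognized(StubHasher, "joshua", stored))          # mismatch
print(verify_if_recognized(StubHasher, "password", "$other$abc"))  # unrecognized
```

Returning ``None`` for unrecognized hashes keeps "wrong password" and "wrong
format" distinguishable for the caller.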

.. rst-class:: float-center

.. seealso::

    In most cases where an application needs to
    distinguish between multiple hash formats, it will be more useful to switch to
    a  :ref:`CryptContext <context-tutorial>` object, which automatically handles this
    and many similar tasks.

.. todo::

    Document usage of :meth:`PasswordHash.needs_update`,
    and how it ties into :meth:`PasswordHash.using`.

.. index:: rounds; choosing the right value

.. _rounds-selection-guidelines:

Choosing the right rounds value
===============================
For hash algorithms with a variable time-cost,
Passlib's :attr:`PasswordHash.default_rounds` values attempt to be secure enough for
the average [#avgsys]_ system. But the "right" value for a given hash
is dependent on the server, its CPU, its expected load, and its users.
Since larger values mean increased work for an attacker...

.. centered::
    The right ``rounds`` value for a given hash & server should be the largest
    possible value that doesn't cause intolerable delay for your users.

For most public-facing services, sign-in can generally
take 250ms to 400ms before users start getting annoyed.
For superuser accounts, it should take as much time as the admin can stand
(usually ~4x more delay than a regular account).

Passlib's :attr:`!default_rounds` values are retuned periodically,
starting with a rough estimate of what an "average" system is capable of,
and then setting all :samp:`{hash}.default_rounds` values to take ~300ms on such a system.
However, some older algorithms (e.g. :class:`~passlib.hash.bsdi_crypt`) are weak enough that
a tradeoff must be made, choosing "more secure but intolerably slow" over "fast but unacceptably insecure".

For this reason, it is strongly recommended to not use a value much lower than Passlib's default,
and to use one of the :ref:`recommended hashes <recommended-hashes>`, as one of their chief qualifying
features is the mere *existence* of rounds values which take a short enough amount of time,
and yet are still considered secure.
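
As a starting point for picking a value on your own hardware, the sketch below
times the standard library's PBKDF2 implementation and scales the result
linearly to a target duration. It is only an estimate under stated assumptions:
it measures :func:`hashlib.pbkdf2_hmac` rather than Passlib itself, so timings
may differ from what a Passlib hasher achieves, and ``time_pbkdf2`` and
``estimate_rounds`` are hypothetical helper names.

```python
import hashlib
import time


def time_pbkdf2(rounds, password=b"password", salt=b"0123456789abcdef"):
    """Return the seconds taken by one PBKDF2-HMAC-SHA256 call."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", password, salt, rounds)
    return time.perf_counter() - start


def estimate_rounds(target_seconds, probe_rounds=20000):
    """Scale a probe measurement linearly to the target duration."""
    # PBKDF2 cost grows linearly with rounds; take the best of three probes
    # to reduce noise from other activity on the machine.
    elapsed = min(time_pbkdf2(probe_rounds) for _ in range(3))
    return max(1, int(probe_rounds * target_seconds / elapsed))


rounds = estimate_rounds(0.300)  # aim for roughly 300ms on this machine
print(rounds)
```

The estimate should be measured on the production server under realistic load,
and the resulting value compared against Passlib's default before it is used.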

.. todo::

    Expand this section into a full document, including
    information from the following posts:

    * http://stackoverflow.com/questions/13545677/python-passlib-what-is-the-best-value-for-rounds
    * http://stackoverflow.com/questions/11829602/pbkdf2-and-hash-comparison

    As well as maybe a JS-interactive calculation helper.


.. [#avgsys] For Passlib 1.6.3, all hashes were retuned to take ~300ms on a
   system with a 3.0 GHz 64-bit CPU.

